Compositions for the use in identification of fungi

ABSTRACT

The present invention provides compositions, kits and methods for rapid identification and quantification of fungi by molecular mass and base composition analysis.

FIELD

Provided herein are compositions, kits and methods for rapid identification and quantification of fungi by molecular mass and base composition analysis.

BACKGROUND

The diagnosis of invasive fungal infections (IFI) is a major unmet medical need. As many as 15% of patients with allogeneic hematopoietic stem cell transplant develop IFI, mostly caused by Aspergillus. Patients with prolonged neutropenia (due to any cause) or immunosuppression (due to transplants or treatment with corticosteroids) are at particularly high risk.

Currently, the diagnosis of IFI relies on a combination of clinical and laboratory criteria. The criteria were developed as an international consensus and the certainty of the diagnosis ranges from definite (detection of the fungus in tissue) to probable or possible. Because definite diagnosis requires biopsy and visualization of the organism in tissue, the majority of patients with IFI fall into the probable or possible categories. Even when a histologic diagnosis is made, the mold cannot be definitively identified because molds grow as hyphae in tissue and do not form spores. Since the anti-fungal susceptibility of different genera and species differs, specific diagnosis is of great clinical importance. There are now effective and relatively non-toxic therapies available for IFI, especially those caused by Aspergillus fumigatus. Thus, the diagnostic limitations have profound effects on the treatment of IFI.

It is challenging to diagnose IFI non-invasively, as it is difficult to successfully culture fungi from blood or respiratory secretions. Complicating the problem, fungi are common in the environment and, thus, a positive culture could be due to environmental contamination. Two non-invasive assays are available. An assay for circulating galactomannan, a constituent of the Aspergillus cell wall, has recently been approved by the FDA. As fungi other than Aspergillus do not have galactomannan in their cell walls, a negative test does not exclude IFI. Furthermore, the sensitivity and specificity of the galactomannan assay vary greatly from study to study for reasons that include technical differences in how the test is performed in the U.S. and Europe, the incidence of aspergillosis in the population tested, and low sample size. A test is also available for circulating glucan, a component of all fungal cell walls. This test should be of wider utility than that for galactomannan, although it has not been as widely studied as the galactomannan assay.

Invasive candidiasis can occur in severely immunosuppressed individuals and in patients who have central venous catheters, especially if they are on systemic antibiotics and/or parenteral nutrition. The diagnosis of invasive candidiasis currently rests primarily on detection of the organism in blood cultures. In the correct clinical setting, positive cultures from respiratory secretions and urine raise the suspicion of systemic infection. The diagnosis of the species of Candida is important because Candida albicans is much more susceptible to fluconazole than other species of Candida, especially C. krusei. Species identification of the organism can take 10 days or more, although the presence of Candida can be determined fairly quickly (1-2 days).

There have been many attempts to develop a diagnostic test for fungal DNA. Blood and bronchoalveolar lavage fluid have been the main fluids studied. Although different DNA extraction methods, various target genes and primers, and a variety of detection methods and analytical techniques have been used, none of the published techniques has shown a strong enough correlation with clinical diagnosis to establish any as a preferred approach.

SUMMARY

Provided herein are compositions, kits and methods for rapid identification and quantification of fungi by molecular mass and base composition analysis.

One embodiment provides an isolated oligonucleotide primer pair having a forward primer member and a reverse primer member. Preferably, the forward primer member and the reverse primer member are independently 13-35 nucleobases in length and are configured to hybridize with a target nucleic acid to generate an amplicon that is from about 45 to about 200 nucleobases in length. In this preferred embodiment the target nucleic acid is a fungi reference sequence. Fungi reference sequences include GenBank Accession No.: X53497.1 (gi No.: 2507) and GenBank Accession No.: X70659 (gi No.: 671812). In this preferred embodiment the forward and reverse primer members are individually configured to have 70% or greater complementarity to the target sequence.

The isolated oligonucleotide primer pair is configured to generate an amplicon from a plurality of fungi bioagents, wherein at least two of the generated amplicons will have unique molecular masses when analyzed using mass spectrometry. The unique molecular masses identify individual fungi bioagents from the plurality of fungi bioagents. Identification is achieved by comparing the unique molecular masses to a database of molecular masses that are indexed to known fungi bioagents and to the primer pairs used to generate the amplicon. Alternatively, base compositions can be calculated from the molecular masses and the base compositions are queried against a database comprising base compositions indexed to fungi bioagents and to the primer pairs used to generate the amplicons.

Thus, in a further embodiment there is provided a method for the identification of fungi bioagents using the isolated oligonucleotide primer pairs. In the preferred embodiment of the method, at least one isolated oligonucleotide primer pair is used to amplify nucleic acid from a sample. The sample is suspected of comprising nucleic acid from one or more fungi bioagents. Each amplicon is analyzed using mass spectrometry to determine the molecular mass of said amplicon. Alternatively, base compositions are calculated from the molecular masses determined for the amplicons. The molecular mass and/or the base composition is then queried against a database of molecular masses and/or base compositions. A match between the experimental data and the database data identifies the fungi bioagent in the sample.

In one embodiment there is provided a database comprising molecular mass and/or base composition data. In this embodiment, the molecular mass and/or base composition data is indexed to fungi bioagents and oligonucleotide primer pairs. The database data represents the molecular mass and/or base composition results that are achieved by using a particular primer pair on a particular known fungi bioagent. Thus, by indexing the molecular mass and/or base composition data with a primer pair and a known bioagent, the query scans the experimentally derived molecular mass or base composition data through a plurality of molecular mass or base composition database data for each of the corresponding oligonucleotide primer pairs until a match is found. The match identifies the bioagent in the sample. In a preferred embodiment the database comprises base composition data.
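The indexed lookup described in this embodiment can be pictured with a minimal Python sketch. It is an illustration only: the dictionary layout, the function names and every entry are hypothetical placeholders rather than data or an implementation from this disclosure, and an exact-match query is assumed for simplicity.

```python
from typing import Dict, List, Optional, Tuple

# A base composition expressed as counts of (A, G, C, T).
BaseComposition = Tuple[int, int, int, int]

# Hypothetical database: primer pair id -> {base composition -> known bioagents}.
# Every entry is a made-up placeholder, not measured data.
DATABASE: Dict[str, Dict[BaseComposition, List[str]]] = {
    "primer_pair_1": {
        (25, 27, 22, 18): ["placeholder fungus A"],
        (24, 28, 22, 18): ["placeholder fungus B"],
    },
    "primer_pair_2": {
        (30, 20, 21, 19): ["placeholder fungus C"],
    },
}


def identify(primer_pair: str, measured: BaseComposition) -> List[str]:
    """Return the bioagents indexed to this primer pair and base composition."""
    return DATABASE.get(primer_pair, {}).get(measured, [])


def first_match(
    measurements: Dict[str, BaseComposition]
) -> Optional[Tuple[str, List[str]]]:
    """Scan the measurements primer pair by primer pair until a match is found."""
    for primer_pair, composition in measurements.items():
        hits = identify(primer_pair, composition)
        if hits:
            return primer_pair, hits
    return None
```

A practical database would presumably tolerate measurement error and isolate-to-isolate variation instead of requiring an exact key match; that refinement is left out of the sketch.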

In one embodiment there is a method for the identification of a fungi bioagent in a sample comprising the step of experimentally generating a molecular mass or a base composition and comparing that molecular mass or base composition to a molecular mass or base composition from at least one known fungal bioagent wherein a match identifies the fungi bioagent in said sample.

One embodiment is an oligonucleotide primer 14 to 35 nucleobases in length having at least 70% sequence identity with SEQ ID NO: 12.

Another embodiment is an oligonucleotide primer 14 to 35 nucleobases in length having at least 70% sequence identity with SEQ ID NO: 27.

Another embodiment is a composition comprising an isolated oligonucleotide primer pair including a forward primer member 14 to 35 nucleobases in length having at least 70% sequence identity with SEQ ID NO: 12 and a reverse primer member 14 to 35 nucleobases in length having at least 70% sequence identity with SEQ ID NO: 27.

One embodiment is an oligonucleotide primer 14 to 35 nucleobases in length having at least 70% sequence identity with SEQ ID NO: 9.

Another embodiment is an oligonucleotide primer 14 to 35 nucleobases in length having at least 70% sequence identity with SEQ ID NO: 24.

Another embodiment is a composition comprising an oligonucleotide primer pair including an oligonucleotide primer 14 to 35 nucleobases in length having at least 70% sequence identity with SEQ ID NO: 9 and an oligonucleotide primer 14 to 35 nucleobases in length having at least 70% sequence identity with SEQ ID NO: 24.

One embodiment is an oligonucleotide primer 14 to 35 nucleobases in length having at least 70% sequence identity with SEQ ID NO: 10.

Another embodiment is an oligonucleotide primer 14 to 35 nucleobases in length having at least 70% sequence identity with SEQ ID NO: 25.

Another embodiment is a composition comprising an oligonucleotide primer pair including an oligonucleotide primer 14 to 35 nucleobases in length having at least 70% sequence identity with SEQ ID NO: 10 and an oligonucleotide primer 14 to 35 nucleobases in length having at least 70% sequence identity with SEQ ID NO: 25.

One embodiment is an oligonucleotide primer 14 to 35 nucleobases in length having at least 70% sequence identity with SEQ ID NO: 11.

Another embodiment is an oligonucleotide primer 14 to 35 nucleobases in length having at least 70% sequence identity with SEQ ID NO: 26.

Another embodiment is a composition comprising an oligonucleotide primer pair including an oligonucleotide primer 14 to 35 nucleobases in length having at least 70% sequence identity with SEQ ID NO: 11 and an oligonucleotide primer 14 to 35 nucleobases in length having at least 70% sequence identity with SEQ ID NO: 26.

In some embodiments, either or both of the primer members of the primer pair contain at least one modified nucleobase such as 5-propynyluracil or 5-propynylcytosine.

In some embodiments, either or both of the primer members of the primer pair comprise at least one universal nucleobase such as inosine.

In some embodiments, either or both of the primer members of the primer pair comprise at least one non-templated T residue on the 5′-end.

In some embodiments, either or both of the primer members of the primer pair comprise at least one non-template tag.

In some embodiments, either or both of the primer members of the primer pair comprise at least one molecular mass modifying tag.

In some embodiments, either or both of the primer members of the primer pair comprise at least one non-templated T nucleotide at the 5′ end of said primer member.

Some embodiments are kits that contain the isolated oligonucleotide primer pair compositions. In some embodiments, each member of said one or more isolated oligonucleotide primer pairs of the kit is independently of a length of 14 to 35 nucleobases and has 70% to 100% sequence identity with the corresponding member from the group of primer pairs represented by SEQ ID NO: 9: SEQ ID NO: 24; SEQ ID NO: 10: SEQ ID NO: 25; SEQ ID NO: 11: SEQ ID NO: 26; and SEQ ID NO: 12: SEQ ID NO: 27.

Some embodiments of the kits contain at least one calibration polynucleotide.

Some embodiments of the kits contain at least one anion exchange functional group linked to a magnetic bead.

In some embodiments, there are provided primers and compositions comprising isolated pairs of oligonucleotide primers, and kits containing the same, and methods for use in identification of fungi. The primers are configured to produce amplification products of DNA encoding genes that have conserved and variable regions among fungi. Further provided are compositions comprising isolated pairs of oligonucleotide primers and kits containing the same, which are configured to provide species and sub-species characterization of fungi.

Some embodiments provide methods for determining the quantity of an unknown fungus in a sample. The sample is contacted with the composition described above and a known quantity of a calibration polynucleotide comprising a calibration sequence. Nucleic acid from the unknown fungus in the sample and nucleic acid from the calibration polynucleotide in the sample are concurrently amplified with the composition described above to obtain a first amplification product comprising a fungal identifying amplicon and a second amplification product comprising a calibration amplicon. The molecular mass and abundance for the fungal identifying amplicon and the calibration amplicon are determined. The fungal identifying amplicon is distinguished from the calibration amplicon based on molecular mass, wherein comparison of fungal identifying amplicon abundance and calibration amplicon abundance indicates the quantity of fungus in the sample. In some embodiments, the base composition of the fungal identifying amplicon is determined.
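The comparison of amplicon abundances can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the calibration method of the disclosure: it assumes that both templates amplify with equal efficiency and that the measured abundance is proportional to the amount of amplicon, and the function and parameter names are hypothetical.

```python
def estimate_fungal_quantity(
    fungal_abundance: float,
    calibrant_abundance: float,
    calibrant_quantity: float,
) -> float:
    """Estimate the starting quantity of fungal template in a sample.

    Assumption (not from the source): equal amplification efficiency and a
    linear abundance signal, so the ratio of the two measured abundances
    equals the ratio of the two starting quantities.
    """
    if calibrant_abundance <= 0:
        raise ValueError("calibration amplicon was not detected")
    if fungal_abundance < 0 or calibrant_quantity < 0:
        raise ValueError("abundance and quantity must be non-negative")
    return calibrant_quantity * fungal_abundance / calibrant_abundance
```

For instance, if the fungal identifying amplicon signal is twice the calibration amplicon signal and 1000 copies of calibration polynucleotide were added, the sketch returns 2000 copies.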

In some embodiments, there are methods for detecting or quantifying fungi by combining a nucleic acid amplification process with a mass determination process. In some embodiments, such methods identify or otherwise analyze the fungi by comparing mass information from an amplification product with a calibration or control product. Such methods can be carried out in a highly multiplexed and/or parallel manner allowing for the analysis of as many as 300 samples per 24 hours on a single mass measurement platform. The accuracy of the mass determination methods in some embodiments allows discrimination between different fungi such as, for example, pathogenic fungi which are members of the phyla Zygomycota, Basidiomycota, Ascomycota, and Fungi incertae sedis. Pathogenic classes within the phylum Zygomycota include zygomycetes of which member species include, but are not necessarily limited to: Absidia corymbifera, Mucor circinelloides, Mucor hiemalis, Rhizopus oryzae, and Rhizopus microsporus. Pathogenic classes within the phylum Basidiomycota include, but are not necessarily limited to, Ustilaginomycetes, which includes the species Malassezia furfur, and Hymenomycetes, which includes the member species Cryptococcus neoformans, Trichosporon cutaneum, Trichosporon asahii, and Trichosporon capitatum. Pathogenic classes within the phylum Ascomycota include, but are not necessarily limited to: Saccharomycetes, which includes the species Clavispora lusitaniae, Candida albicans, Candida dubliniensis, Candida glabrata, Candida krusei, Candida parapsilosis and Candida tropicalis; Eurotiales, which includes the species Aspergillus flavus, Aspergillus fumigatus, Aspergillus niger, Aspergillus terreus, and Aspergillus oryzae; Ophiostomatales, which includes the species Sporothrix schenckii; Onygenales, which includes the species Microsporum audouini, Microsporum canis, Microsporum gypseum, Trichophyton mentagrophytes, Trichophyton rubrum, Trichophyton tonsurans, Trichophyton violaceum, Ajellomyces dermatitidis, Coccidioides immitis, Epidermophyton floccosum, Histoplasma capsulatum, and Paracoccidioides brasiliensis; Ascomycota incertae sedis, which includes the species Cladosporium werneckii; and Anamorphic Ascomycota, which includes the species Penicillium marneffei, Fusarium oxysporum, Fusarium solani, Hortaea werneckii, Paecilomyces lilacinus, Paecilomyces variotii, Scedosporium prolificans, Scedosporium apiospermum, and Madurella grisea. Pathogenic classes within the phylum Fungi incertae sedis include, but are not necessarily limited to, Pneumocystidales, which includes the species Pneumocystis carinii.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing summary, as well as the following detailed description, is better understood when read in conjunction with the accompanying drawings which are included by way of example and not by way of limitation.

FIG. 1: Process diagram illustrating a representative primer pair selection process.

FIG. 2: Process diagram illustrating an embodiment of the calibration method.

FIG. 3: Series of mass spectra of amplification products of fungi produced with primer pair number 3030 and exhibiting different molecular masses and base compositions.

FIG. 4: Three-dimensional base composition diagram representing base compositions of amplification products of fungi produced with primer pair number 3030.

DEFINITIONS

As used herein, the term “abundance” refers to an amount. The amount may be described in terms of concentration units which are common in molecular biology, such as “copy number” or “pfu (plaque-forming unit),” which are well known to those with ordinary skill. Concentration may be relative to a known standard or may be absolute.

As used herein, the term “amplifiable nucleic acid” is used in reference to nucleic acids that may be amplified by any amplification method. It is contemplated that “amplifiable nucleic acid” also comprises “sample template.”

As used herein, the term “amplification” refers to a special case of nucleic acid replication involving template specificity. It is to be contrasted with non-specific template replication (i.e., replication that is template-dependent but not dependent on a specific template). Template specificity is here distinguished from fidelity of replication (i.e., synthesis of the proper polynucleotide sequence) and nucleotide (ribo- or deoxyribo-) specificity. Template specificity is frequently described in terms of “target” specificity. Target sequences are “targets” in the sense that they are sought to be sorted out from other nucleic acid. Amplification techniques have been configured primarily for this sorting out. Template specificity is achieved in most amplification techniques by the choice of enzyme. Amplification enzymes are enzymes that, under the conditions in which they are used, will process only specific sequences of nucleic acid in a heterogeneous mixture of nucleic acid. For example, in the case of Qβ replicase, MDV-1 RNA is the specific template for the replicase (D. L. Kacian et al., Proc. Natl. Acad. Sci. USA 69:3038 [1972]). Other nucleic acid will not be replicated by this amplification enzyme. Similarly, in the case of T7 RNA polymerase, this amplification enzyme has a stringent specificity for its own promoters (Chamberlin et al., Nature 228:227 [1970]). In the case of T4 DNA ligase, the enzyme will not ligate the two oligonucleotides or polynucleotides where there is a mismatch between the oligonucleotide or polynucleotide substrate and the template at the ligation junction (D. Y. Wu and R. B. Wallace, Genomics 4:560 [1989]). Finally, Taq and Pfu polymerases, by virtue of their ability to function at high temperature, are found to display high specificity for the sequences bounded and thus defined by the primers; the high temperature results in thermodynamic conditions that favor primer hybridization with the target sequences and not hybridization with non-target sequences (H. A. Erlich (ed.), PCR Technology, Stockton Press [1989]).

As used herein, the term “amplification reagents” refers to those reagents (deoxyribonucleotide triphosphates, buffer, etc.), needed for amplification, excluding primers, nucleic acid template, and the amplification enzyme. Typically, amplification reagents along with other reaction components are placed and contained in a reaction vessel (test tube, microwell, etc.).

As used herein, the term “anion exchange functional group” refers to a positively charged functional group capable of binding an anion through an electrostatic interaction. The most well known anion exchange functional groups are the amines, including primary, secondary, tertiary and quaternary amines.

As used herein, a “base composition” is the number of each nucleobase (for example, A, T, C and G) and nucleobase analogs. For example, amplification of nucleic acid of Neisseria meningitidis with a primer pair produces an amplification product from nucleic acid of 23S rRNA that has a molecular mass (sense strand) of 28480.75124, from which a base composition of A25 G27 C22 T18 is assigned from a list of possible base compositions calculated from the molecular mass using standard known molecular masses of each of the four nucleobases. Similarly, the same amplification product generated using nucleotide analogs (for example 5-iodo-C) has a base composition of A25 G27 5-iodo-C22 T18.
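The step of listing possible base compositions for a measured mass can be illustrated with a brute-force Python sketch. It rests on assumptions that are not stated in the text: the residue masses are standard monoisotopic values for unmodified deoxynucleotide residues, the strand is assumed to be linear with hydroxyl groups at both ends, and the 0.02 Da tolerance and all function names are hypothetical. It is not presented as the algorithm used to produce the example above.

```python
from typing import List, Tuple

# Assumed monoisotopic masses (Da) of deoxynucleotide residues within a strand.
RESIDUE_MASS = {"A": 313.057607, "G": 329.052522, "C": 289.046374, "T": 304.046040}

# Assumed end-group correction for a linear strand with 5'-OH and 3'-OH ends:
# add one water, remove one HPO3.
END_CORRECTION = 18.010565 - 79.966332


def strand_mass(a: int, g: int, c: int, t: int) -> float:
    """Mass of a single strand with the given counts of A, G, C and T."""
    return (a * RESIDUE_MASS["A"] + g * RESIDUE_MASS["G"]
            + c * RESIDUE_MASS["C"] + t * RESIDUE_MASS["T"] + END_CORRECTION)


def candidate_compositions(mass: float, tolerance: float = 0.02) -> List[Tuple[int, int, int, int]]:
    """List every (A, G, C, T) count whose strand mass is within tolerance of mass."""
    m_a, m_g, m_c, m_t = (RESIDUE_MASS[b] for b in "AGCT")
    target = mass - END_CORRECTION
    hits = []
    for a in range(int(target // m_a) + 2):
        rem_a = target - a * m_a
        if rem_a < -tolerance:
            break
        for g in range(int(rem_a // m_g) + 2):
            rem_g = rem_a - g * m_g
            if rem_g < -tolerance:
                break
            for c in range(int(rem_g // m_c) + 2):
                rem_c = rem_g - c * m_c
                if rem_c < -tolerance:
                    break
                # Solve for the T count that best explains the remaining mass.
                t = round(rem_c / m_t)
                if t >= 0 and abs(rem_c - t * m_t) <= tolerance:
                    hits.append((a, g, c, t))
    return hits
```

Under these assumptions, the composition A25 G27 C22 T18 appears among the candidates returned for a mass of 28480.75124. More than one composition can fall within the tolerance, which is why additional constraints are needed to assign a single composition.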

As used herein, a “base composition probability cloud” is a representation of the diversity in base composition resulting from a variation in sequence that occurs among different isolates of a given species. The “base composition probability cloud” represents the base composition constraints for each species and is typically visualized using a pseudo four-dimensional plot.

As used herein, a “bioagent” is any organism, cell, or virus, living or dead, or a nucleic acid derived from such an organism, cell or virus. Examples of bioagents include, but are not limited to, cells (including but not limited to human clinical samples, bacterial cells and other pathogens), viruses, fungi, protists, and parasites. Samples may be alive or dead or in a vegetative state (for example, vegetative bacteria or spores) and may be encapsulated or bioengineered. As used herein, a “pathogen” is a bioagent which causes a disease or disorder.

As used herein, a “bioagent division” is defined as a group of bioagents above the species level and includes, but is not limited to, orders, families, classes, clades, genera or other such groupings of bioagents above the species level.

As used herein, the term “bioagent identifying amplicon” refers to a polynucleotide that is amplified from a bioagent in an amplification reaction and which 1) provides sufficient variability to distinguish bioagents from one another to a significant level and 2) has a molecular mass that is amenable to molecular mass determination methods such as mass spectrometry.

As used herein, the term “biological product” refers to any product originating from an organism. Biological products are often products of processes of biotechnology. Examples of biological products include, but are not limited to: cultured cell lines, cellular components, antibodies, proteins and other cell-derived biomolecules, growth media, growth harvest fluids, natural products and bio-pharmaceutical products.

The terms “biowarfare agent” and “bioweapon” are synonymous and refer to a bacterium, virus, fungus or protozoan that could be deployed as a weapon to cause bodily harm to individuals by military or terrorist groups.

The term “broad range survey primer pair” refers to a primer pair configured to produce bioagent identifying amplicons across different broad groupings of bioagents. For example, the ribosomal RNA-targeted primer pairs are broad range survey primer pairs.

The term “calibration amplicon” refers to a nucleic acid segment representing an amplification product obtained by amplification of a calibration sequence with a pair of primers configured to produce a bioagent identifying amplicon.

The term “calibration sequence” refers to a polynucleotide sequence to which a given pair of primers hybridizes for the purpose of producing an internal (i.e., included in the reaction) calibration standard amplification product for use in determining the quantity of a bioagent in a sample. The calibration sequence may be expressly added to an amplification reaction, or may already be present in the sample prior to analysis.

The term “clade primer pair” refers to a primer pair configured to produce bioagent identifying amplicons for species belonging to a clade group. A clade primer pair may also be considered as a speciating primer pair since it will have the capability of resolving species within the clade group.

The term “codon” refers to a set of three adjoined nucleotides (triplet) that codes for an amino acid or a termination signal.

As used herein, the term “codon base composition analysis” refers to determination of the base composition of an individual codon by obtaining a bioagent identifying amplicon that includes the codon. The bioagent identifying amplicon will at least include regions of the target nucleic acid sequence to which the primers hybridize for generation of the bioagent identifying amplicon as well as the codon being analyzed, located between the two primer hybridization regions.

As used herein, the terms “complementary” or “complementarity” are used in reference to polynucleotides (i.e., a sequence of nucleotides such as an oligonucleotide or a target nucleic acid) related by the base-pairing rules. For example, the sequence “5′-A-G-T-3′,” is complementary to the sequence “3′-T-C-A-5′.” Complementarity may be “partial,” in which only some of the nucleic acids' bases are matched according to the base pairing rules. Or, there may be “complete” or “total” complementarity between the nucleic acids. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods that depend upon binding between nucleic acids. Either term may also be used in reference to individual nucleotides, especially within the context of polynucleotides. For example, a particular nucleotide within an oligonucleotide may be noted for its complementarity, or lack thereof, to a nucleotide within another nucleic acid strand, in contrast or comparison to the complementarity between the rest of the oligonucleotide and the nucleic acid strand.
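The base-pairing rules in this definition can be made concrete with a short Python sketch. It handles only the four standard DNA bases, and the function names are hypothetical. Both strands are written in the 5′ to 3′ direction, so the sequence “3′-T-C-A-5′” from the example is entered as ACT.

```python
# Watson-Crick pairing for the four standard DNA bases.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq_5to3: str) -> str:
    """Return the fully complementary strand, also written 5' to 3'."""
    return "".join(PAIRS[base] for base in reversed(seq_5to3.upper()))


def is_completely_complementary(first_5to3: str, second_5to3: str) -> bool:
    """True when every base of one strand pairs with the other, antiparallel."""
    return reverse_complement(first_5to3) == second_5to3.upper()
```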

The primer members of an oligonucleotide primer pair are complementary to the target nucleic acid. The primer members individually have 70% or greater complementarity to a target. The primer members are configured to hybridize with a plurality of fungi bioagents including, but not limited to, the reference bioagent. Thus, the complementarity of a primer member to the reference target or to any sample target to which it hybridizes is preferably 70% or greater. 70% or greater means all whole numbers from 70-100, as well as any fractions. So, for example, if a forward primer member is 22 nucleobases long and has 17 complementary nucleobases to the target, and a reverse primer member is 25 nucleobases and has 20 complementary nucleobases to the target then the forward primer member is 77.3% complementary to the target and the reverse primer member is 80% complementary to the target. As used herein, 77.3% is rounded to 77%, while 77.5% would be rounded to 78%. Those of ordinary skill in the art understand primer complementarity. Primer members having less than 100% complementarity to a target comprise nucleotide insertions, deletions and additions compared to a 100% complementary primer member. 70% to 100% include the following whole numbers: 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100%.
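The arithmetic in this example can be reproduced with a small Python sketch. The function names are hypothetical. Half-up rounding is written out explicitly because Python's built-in `round` rounds halves to the nearest even integer, which would not always match the rounding convention described above.

```python
import math


def percent_complementarity(complementary_bases: int, primer_length: int) -> float:
    """Percentage of primer nucleobases that are complementary to the target."""
    if primer_length <= 0 or not 0 <= complementary_bases <= primer_length:
        raise ValueError("counts must satisfy 0 <= complementary <= length, length > 0")
    return 100.0 * complementary_bases / primer_length


def round_half_up(value: float) -> int:
    """Round to the nearest whole number, with halves rounded upward."""
    return math.floor(value + 0.5)


def meets_threshold(complementary_bases: int, primer_length: int, minimum: float = 70.0) -> bool:
    """Check a primer member against the 70% complementarity floor."""
    return percent_complementarity(complementary_bases, primer_length) >= minimum
```

With the figures from the text, a 22-nucleobase forward primer member with 17 complementary nucleobases gives 77% after rounding, and a 25-nucleobase reverse primer member with 20 complementary nucleobases gives 80%.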

The term “complement of a nucleic acid sequence” as used herein refers to an oligonucleotide which, when aligned with the nucleic acid sequence such that the 5′ end of one sequence is paired with the 3′ end of the other, is in “antiparallel association.” Certain bases not commonly found in natural nucleic acids may be included in the nucleic acids, for example, inosine and 7-deazaguanine. Complementarity need not be perfect; stable duplexes may contain mismatched base pairs or unmatched bases. Those skilled in the art of nucleic acid technology can determine duplex stability empirically considering a number of variables including, for example, the length of the oligonucleotide, base composition and sequence of the oligonucleotide, ionic strength and incidence of mismatched base pairs. Where a first oligonucleotide is complementary to a region of a target nucleic acid and a second oligonucleotide has complementarity to the same region (or a portion of this region) a “region of overlap” exists along the target nucleic acid. The degree of overlap will vary depending upon the extent of the complementarity.

As used herein, the term “division-wide primer pair” refers to a primer pair configured to produce bioagent identifying amplicons within sections of a broad spectrum of bioagents. An example is a primer pair configured to produce bioagent identifying amplicons for the beta-proteobacteria division of bacteria.

As used herein, the term “concurrently amplifying” used with respect to more than one amplification reaction refers to the act of simultaneously amplifying more than one nucleic acid in a single reaction mixture.

As used herein, the term “drill down primer pair” refers to a primer pair configured to produce bioagent identifying amplicons for identification of sub-species characteristics.

The term “duplex” refers to the state of nucleic acids in which the base portions of the nucleotides on one strand are bound through hydrogen bonding to their complementary bases arrayed on a second strand. The condition of being in a duplex form reflects on the state of the bases of a nucleic acid. By virtue of base pairing, the strands of nucleic acid also generally assume the tertiary structure of a double helix, having a major and a minor groove. The assumption of the helical form is implicit in the act of becoming duplexed.

As used herein, the term “etiology” refers to the causes or origins of diseases or abnormal physiological conditions.

The term “gene” refers to a DNA sequence that comprises control and coding sequences necessary for the production of an RNA having a non-coding function (e.g., a ribosomal or transfer RNA), a polypeptide or a precursor. The RNA or polypeptide can be encoded by a full length coding sequence or by any portion of the coding sequence so long as the desired activity or function is retained.

The terms “homology,” “homologous” and “sequence identity” refer to a degree of identity. There may be partial homology or complete homology. A partially homologous sequence is one that is less than 100% identical to another sequence. Determination of sequence identity is described in the following example: a primer 20 nucleobases in length which is otherwise identical to another 20 nucleobase primer but having two non-identical residues has 18 of 20 identical residues (18/20=0.9 or 90% sequence identity). In another example, a primer 15 nucleobases in length having all residues identical to a 15 nucleobase segment of a primer 20 nucleobases in length would have 15/20=0.75 or 75% sequence identity with the 20 nucleobase primer. Percentages can be whole numbers or fractions, as described above. Sequence identity is meant to be properly determined when the query sequence and the subject sequence are both described and aligned in the 5′ to 3′ direction. Sequence alignment algorithms such as BLAST will return results in two different alignment orientations. In the Plus/Plus orientation, both the query sequence and the subject sequence are aligned in the 5′ to 3′ direction. On the other hand, in the Plus/Minus orientation, the query sequence is in the 5′ to 3′ direction while the subject sequence is in the 3′ to 5′ direction. It should be understood that with respect to the primers of the present invention, sequence identity is properly determined when the alignment is designated Plus/Plus. Sequence identity may also encompass alternate or modified nucleobases that perform in a functionally similar manner to the regular nucleobases adenine, thymine, guanine and cytosine with respect to hybridization and primer extension in amplification reactions. In a non-limiting example, if the 5-propynyl pyrimidines propyne C and/or propyne T replace one or more C or T residues in one primer which is otherwise identical to another primer in sequence and length, the two primers will have 100% sequence identity with each other. In another non-limiting example, inosine (I) may be used as a replacement for G or T and effectively hybridize to C, A or U (uracil). Thus, if inosine replaces one or more C, A or U residues in one primer which is otherwise identical to another primer in sequence and length, the two primers will have 100% sequence identity with each other. Other such modified or universal bases may exist which would perform in a functionally similar manner for hybridization and amplification reactions and will be understood to fall within this definition of sequence identity.
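The two worked examples of sequence identity can be reproduced with the following Python sketch. It reflects one reading of the definition and is an assumption rather than a prescribed algorithm: both sequences are taken in the 5′ to 3′ (Plus/Plus) direction, the shorter sequence is slid along the longer one without gaps, and the best count of identical residues is divided by the length of the longer sequence. Modified or universal nucleobases are not handled, and the sequences used in the usage note are arbitrary placeholders.

```python
def percent_sequence_identity(query_5to3: str, subject_5to3: str) -> float:
    """Ungapped Plus/Plus percent identity relative to the longer sequence."""
    shorter, longer = sorted((query_5to3.upper(), subject_5to3.upper()), key=len)
    if not longer:
        raise ValueError("sequences must not be empty")
    best = 0
    # Try every placement of the shorter sequence along the longer one.
    for offset in range(len(longer) - len(shorter) + 1):
        window = longer[offset:offset + len(shorter)]
        matches = sum(1 for s, w in zip(shorter, window) if s == w)
        best = max(best, matches)
    return 100.0 * best / len(longer)
```

For a placeholder 20-nucleobase primer such as ACGTACGTACGTACGTACGT, a second 20-nucleobase primer that differs at two positions scores 90%, and a 15-nucleobase primer identical to an internal segment of it scores 75%, in line with the examples in the text.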

As used herein, “housekeeping gene” refers to a gene encoding a protein or RNA involved in basic functions required for survival and reproduction of a bioagent. Housekeeping genes include, but are not limited to, genes encoding RNA or proteins involved in translation, replication, recombination and repair, transcription, nucleotide metabolism, amino acid metabolism, lipid metabolism, energy generation, uptake, secretion and the like.

As used herein, the term “hybridization” is used in reference to the pairing of complementary nucleic acids. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) are influenced by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, and the Tm of the formed hybrid. “Hybridization” methods involve the annealing of one nucleic acid to another, complementary nucleic acid, i.e., a nucleic acid having a complementary nucleotide sequence. The ability of two polymers of nucleic acid containing complementary sequences to find each other and anneal through base pairing interaction is a well-recognized phenomenon. The initial observations of the “hybridization” process by Marmur and Lane, Proc. Natl. Acad. Sci. USA 46:453 (1960) and Doty et al., Proc. Natl. Acad. Sci. USA 46:461 (1960) have been followed by the refinement of this process into an essential tool of modern biology.

The terms “isolated oligonucleotide primer pair,” “isolated primer pair,” “isolated primer,” “isolated primer pair member” or “isolated oligonucleotide” refer to nucleic acid sequences that are substantially purified from components not of interest. Preferably, each of the primer members of the oligonucleotide primer pair is chemically synthesized and isolated using well-known techniques. Preferably, the isolated oligonucleotide primer pairs are at least 60% free from said components not of interest. Those of ordinary skill understand that at least 60% means 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% free from said components not of interest. In the most preferred embodiment each primer member is chemically synthesized and then either lyophilized or resuspended in an appropriate solution, such as Tris-EDTA (TE) buffer or HPLC water. Thus, an isolated oligonucleotide primer pair or an isolated primer pair member is purified from components not of interest.

The term “in silico” refers to processes taking place via computer calculations. For example, electronic PCR (ePCR) is a process analogous to ordinary PCR except that it is carried out using nucleic acid sequences and primer pair sequences stored on a computer formatted medium.
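
As a non-limiting sketch of what an ePCR calculation might look like, the following locates a forward primer and the reverse complement of a reverse primer on a stored template and returns the predicted amplification product. The function names, the exact-match requirement and the example sequences are assumptions made for illustration; practical ePCR tools typically tolerate mismatches and search both strands.

```python
from typing import Optional

# Translation table for complementing unmodified DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def electronic_pcr(template: str, forward: str, reverse: str) -> Optional[str]:
    """Predict the amplicon (top strand, 5' to 3') or return None.

    Assumes exact primer matches and that the forward primer site lies
    upstream of the reverse primer site on the supplied strand.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical template and primers, both primers written 5' to 3'.
template = "TTTTACGTACGGGAAACCCGATTACACCCC"
print(electronic_pcr(template, "ACGTAC", "TGTAATC"))
# ACGTACGGGAAACCCGATTACA
```

The returned string spans both primer hybridization sites and the intervening region, which is the portion of the template that an ordinary PCR with the same primers would be expected to amplify.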

As described herein, oligonucleotide primers are configured to bind to conserved sequence regions of a plurality of bioagents that flank an intervening variable region and, upon amplification, yield amplification products which ideally provide enough variability to distinguish individual bioagents in said plurality, and which are amenable to molecular mass analysis. Preferably the conserved regions are highly conserved. By the term “highly conserved,” it is meant that the sequence regions exhibit between about 80-100%, or between about 90-100%, or between about 95-100% identity among all, or at least 70%, at least 80%, at least 90%, at least 95%, or at least 99% of species or strains.

The “ligase chain reaction” (LCR; sometimes referred to as “Ligase Amplification Reaction” (LAR)), described by Barany, Proc. Natl. Acad. Sci., 88:189 (1991); Barany, PCR Methods and Applic., 1:5 (1991); and Wu and Wallace, Genomics 4:560 (1989), has developed into a well-recognized alternative method for amplifying nucleic acids. In LCR, four oligonucleotides, two adjacent oligonucleotides which uniquely hybridize to one strand of target DNA, and a complementary set of adjacent oligonucleotides that hybridize to the opposite strand, are mixed and DNA ligase is added to the mixture. Provided that there is complete complementarity at the junction, ligase will covalently link each set of hybridized molecules. Importantly, in LCR, two probes are ligated together only when they base-pair with sequences in the target sample, without gaps or mismatches. Repeated cycles of denaturation, hybridization and ligation amplify a short segment of DNA. LCR has also been used in combination with PCR to achieve enhanced detection of single-base changes. However, because the four oligonucleotides used in this assay can pair to form two short ligatable fragments, there is the potential for the generation of target-independent background signal. The use of LCR for mutant screening is limited to the examination of specific nucleic acid positions.

The term “locked nucleic acid” or “LNA” refers to a nucleic acid analogue containing one or more 2′-O,4′-C-methylene-β-D-ribofuranosyl nucleotide monomers in an RNA mimicking sugar conformation. LNA oligonucleotides display unprecedented hybridization affinity toward complementary single-stranded RNA and complementary single- or double-stranded DNA. LNA oligonucleotides induce A-type (RNA-like) duplex conformations.

As used herein, the term “mass-modifying tag” refers to any modification to a given nucleotide which results in an increase in mass relative to the analogous non-mass modified nucleotide. Mass-modifying tags can include heavy isotopes of one or more elements included in the nucleotide, such as carbon-13, for example. Other possible modifications include addition of substituents such as iodine or bromine at the 5 position of the nucleobase, for example.

The term “mass spectrometry” refers to measurement of the mass of atoms or molecules. The molecules are first converted to ions, which are separated using electric or magnetic fields according to the ratio of their mass to electric charge. The measured masses are used to identify the molecules.

The term “microorganism” as used herein means an organism too small to be observed with the unaided eye and includes, but is not limited to, bacteria, viruses, protozoans, fungi and ciliates.

The term “multi-drug resistant” or “multiple-drug resistant” refers to a microorganism which is resistant to more than one of the antibiotics or antimicrobial agents used in the treatment of said microorganism.

The term “multiplex PCR” refers to a PCR reaction where more than one primer set is included in the reaction pool, allowing 2 or more different DNA targets to be amplified by PCR in a single reaction tube.

The term “non-templated T” refers to a nucleotide residue that is added to the 5′ end of a primer pair member. The non-templated T residue is not complementary to the target nucleic acid sequence. The addition of a non-templated T residue has the effect of minimizing the addition of non-templated adenosine residues as a result of the non-specific enzyme activity of Taq polymerase (Magnuson et al., Biotechniques, 1996, 21, 700-709). Left unchecked, this non-specific enzyme activity of Taq polymerase can lead to ambiguous results during molecular mass analysis.

The term “non-template tag” refers to a stretch of at least three guanine or cytosine nucleobases of a primer used to produce a bioagent identifying amplicon which are not complementary to the template. A non-template tag is incorporated into a primer for the purpose of increasing the primer-duplex stability of later cycles of amplification by incorporation of extra G-C pairs which each have one additional hydrogen bond relative to an A-T pair.

The term “nucleic acid sequence” as used herein refers to the linear composition of the nucleic acid residues A, T, C or G or any analogs or modifications thereof, within an oligonucleotide, nucleotide or polynucleotide, and fragments or portions thereof, and to DNA or RNA of genomic or synthetic origin which may be single or double stranded, and represent the sense or antisense strand.

As used herein, the term “nucleobase” is synonymous with other terms in use in the art including “nucleotide,” “deoxynucleotide,” “nucleotide residue,” “deoxynucleotide residue,” “nucleotide triphosphate (NTP),” or “deoxynucleotide triphosphate (dNTP).”

The term “nucleotide analog” as used herein refers to modified or non-naturally occurring nucleotides such as 5-propynyl pyrimidines (i.e., 5-propynyl-dTTP and 5-propynyl-dCTP) and 7-deaza purines (i.e., 7-deaza-dATP and 7-deaza-dGTP). Nucleotide analogs include base analogs and comprise modified forms of deoxyribonucleotides as well as ribonucleotides.

The term “oligonucleotide” as used herein is defined as a molecule comprising two or more deoxyribonucleotides or ribonucleotides, preferably at least 5 nucleotides, more preferably at least about 13 to 35 nucleotides. The exact size will depend on many factors, which in turn depend on the ultimate function or use of the oligonucleotide. The oligonucleotide may be generated in any manner, including chemical synthesis, DNA replication, reverse transcription, PCR, or a combination thereof. Because mononucleotides are reacted to make oligonucleotides in a manner such that the 5′ phosphate of one mononucleotide pentose ring is attached to the 3′ oxygen of its neighbor in one direction via a phosphodiester linkage, an end of an oligonucleotide is referred to as the “5′-end” if its 5′ phosphate is not linked to the 3′ oxygen of a mononucleotide pentose ring and as the “3′-end” if its 3′ oxygen is not linked to a 5′ phosphate of a subsequent mononucleotide pentose ring. As used herein, a nucleic acid sequence, even if internal to a larger oligonucleotide, also may be said to have 5′ and 3′ ends. A first region along a nucleic acid strand is said to be upstream of another region if the 3′ end of the first region is before the 5′ end of the second region when moving along a strand of nucleic acid in a 5′ to 3′ direction. All oligonucleotide primers disclosed herein are understood to be presented in the 5′ to 3′ direction when reading left to right. When two different, non-overlapping oligonucleotides anneal to different regions of the same linear complementary nucleic acid sequence, and the 3′ end of one oligonucleotide points towards the 5′ end of the other, the former may be called the “upstream” oligonucleotide and the latter the “downstream” oligonucleotide. 
Similarly, when two overlapping oligonucleotides are hybridized to the same linear complementary nucleic acid sequence, with the first oligonucleotide positioned such that its 5′ end is upstream of the 5′ end of the second oligonucleotide, and the 3′ end of the first oligonucleotide is upstream of the 3′ end of the second oligonucleotide, the first oligonucleotide may be called the “upstream” oligonucleotide and the second oligonucleotide may be called the “downstream” oligonucleotide.

As used herein, a “pathogen” is a bioagent which causes a disease or disorder.

As used herein, the terms “PCR product,” “PCR fragment,” “amplicon” or “amplification product” refer to the resultant mixture of compounds after two or more cycles of the PCR steps of denaturation, annealing and extension are complete. These terms encompass the case where there has been amplification of one or more segments of one or more target sequences.

The term “peptide nucleic acid” (“PNA”) as used herein refers to a molecule comprising bases or base analogs such as would be found in natural nucleic acid, but attached to a peptide backbone rather than the sugar-phosphate backbone typical of nucleic acids. The attachment of the bases to the peptide is such as to allow the bases to base pair with complementary bases of nucleic acid in a manner similar to that of an oligonucleotide. These small molecules, also designated antigene agents, stop transcript elongation by binding to their complementary strand of nucleic acid (Nielsen, et al. Anticancer Drug Des. 8:53-63).

The term “polymerase” refers to an enzyme having the ability to synthesize a complementary strand of nucleic acid from a starting template nucleic acid strand and free dNTPs.

As used herein, the term “polymerase chain reaction” (“PCR”) refers to the method of K. B. Mullis, U.S. Pat. Nos. 4,683,195, 4,683,202, and 4,965,188, hereby incorporated by reference, which describe a method for increasing the concentration of a segment of a target sequence in a mixture of genomic DNA without cloning or purification. This process for amplifying the target sequence consists of introducing a large excess of two oligonucleotide primers to the DNA mixture containing the desired target sequence, followed by a precise sequence of thermal cycling in the presence of a DNA polymerase. The two primers are complementary to their respective strands of the double stranded target sequence. To effect amplification, the mixture is denatured and the primers are then annealed to their complementary sequences within the target molecule. Following annealing, the primers are extended with a polymerase so as to form a new pair of complementary strands. The steps of denaturation, primer annealing, and polymerase extension can be repeated many times (i.e., denaturation, annealing and extension constitute one “cycle”; there can be numerous “cycles”) to obtain a high concentration of an amplified segment of the desired target sequence. The length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter. By virtue of the repeating aspect of the process, the method is referred to as the “polymerase chain reaction” (hereinafter “PCR”). 
Because the desired amplified segments of the target sequence become the predominant sequences (in terms of concentration) in the mixture, they are said to be “PCR amplified.” With PCR, it is possible to amplify a single copy of a specific target sequence in genomic DNA to a level detectable by several different methodologies (e.g., hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; incorporation of 32P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified segment). In addition to genomic DNA, any oligonucleotide or polynucleotide sequence can be amplified with the appropriate set of primer molecules. In particular, the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications.
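
The effect of repeated cycling can be illustrated with a simple idealized model, offered here only as a sketch. It assumes that each cycle multiplies the number of target copies by (1 + efficiency), where an efficiency of 1.0 corresponds to perfect doubling; the function name and the efficiency parameter are hypothetical and do not appear in the cited patents, and real reactions plateau as reagents are consumed.

```python
def copies_after_cycles(initial_copies: float, cycles: int,
                        efficiency: float = 1.0) -> float:
    """Idealized number of target copies after a given number of PCR cycles.

    efficiency is the assumed fraction of templates copied per cycle
    (1.0 means every template is duplicated in every cycle).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


# A single starting copy after 30 ideal cycles.
print(copies_after_cycles(1, 30))  # 1073741824.0
```

Under these assumptions a single copy yields on the order of a billion copies after 30 cycles, which is consistent with the statement that a single copy of a target sequence can be amplified to a detectable level.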

The term “polymerization means” or “polymerization agent” refers to any agent capable of facilitating the addition of nucleoside triphosphates to an oligonucleotide. Preferred polymerization means comprise DNA and RNA polymerases.

As used herein, the terms “oligonucleotide primer pair,” “pair of primers,” or “primer pair” are used in reference to a composition with a forward primer member and a reverse primer member. The forward primer hybridizes to a sense strand of a target gene sequence to be amplified and primes synthesis of an antisense strand (complementary to the sense strand) using the target sequence as a template. The reverse primer hybridizes to the antisense strand of a target gene sequence to be amplified and primes synthesis of a sense strand (complementary to the antisense strand) using the target sequence as a template.

The primers are configured to bind to conserved sequence regions of a bioagent identifying amplicon that flank an intervening variable region and yield amplification products which provide enough variability to distinguish each individual bioagent, and which are amenable to molecular mass analysis. In some embodiments, the conserved sequence regions exhibit between about 80-100%, or between about 90-100%, or between about 95-100% identity, or between about 99-100% identity. This includes (fractions rounded as described): 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100%. The molecular mass of a given amplification product provides a means of identifying the bioagent from which it was obtained, due to the variability of the variable region. Thus, configuration of the primers requires selection of a variable region with appropriate variability to resolve the identity of a given bioagent. Bioagent identifying amplicons are ideally specific to the identity of the bioagent.

Properties of the primers may include any number of properties related to structure including, but not limited to: nucleobase length, which may be contiguous (linked together) or non-contiguous (for example, two or more contiguous segments which are joined by a linker or loop moiety); modified or universal nucleobases (used for specific purposes such as, for example, increasing hybridization affinity, preventing non-templated adenylation and modifying molecular mass); and percent complementarity to a given target sequence.

Properties of the oligonucleotide primer pairs also include functional features including, but not limited to, orientation of hybridization (forward or reverse) relative to a nucleic acid template. The coding or sense strand is the strand to which the forward priming primer hybridizes (forward priming orientation) while the reverse priming primer hybridizes to the non-coding or antisense strand (reverse priming orientation). The functional properties of a given primer pair also include the generic template nucleic acid to which the primer pair hybridizes. For example, identification of bioagents can be accomplished at different levels using primers suited to resolution of each individual level of identification. Broad range survey primers are configured with the objective of identifying a bioagent as a member of a particular division (e.g., an order, family, genus or other such grouping of bioagents above the species level of bioagents). In some embodiments, broad range survey intelligent primers are capable of identification of bioagents at the species or sub-species level. Other primers may have the functionality of producing bioagent identifying amplicons for members of a given taxonomic genus, clade, species, sub-species or genotype (including genetic variants which may include presence of virulence genes or antibiotic resistance genes or mutations). Additional functional properties of primer pairs include the functionality of performing amplification either singly (single primer pair per amplification reaction vessel) or in a multiplex fashion (multiple primer pairs and multiple amplification reactions within a single reaction vessel).

The term “reference sequence” refers to a fungi bioagent sequence that is used for oligonucleotide primer pair naming. In the preferred embodiment, the nucleic acid sequences from a plurality of fungi bioagents are aligned and conserved regions are identified (as described herein). Primer pairs are configured such that the primer pairs will generate bioagent identifying amplicons from at least two of said plurality of fungi bioagents. One of said plurality of fungi bioagents in the alignment is used as a reference sequence, indicating the position of the primer pair relative to that fungi bioagent. However, the primer pair is not necessarily fully complementary to the reference sequence.

The term “reverse transcriptase” refers to an enzyme having the ability to transcribe DNA from an RNA template. This enzymatic activity is known as reverse transcriptase activity. Reverse transcriptase activity is desirable in order to obtain DNA from RNA viruses which can then be amplified and analyzed by the current methods.

The term “Ribosomal RNA” or “rRNA” refers to the primary ribonucleic acid constituent of ribosomes. Ribosomes are the protein-manufacturing organelles of cells and exist in the cytoplasm. Ribosomal RNAs are transcribed from the DNA genes encoding them.

The term “sample” in the present specification and claims is used in its broadest sense. On the one hand it is meant to include a specimen or culture (e.g., microbiological cultures). On the other hand, it is meant to include both biological and environmental samples. A sample may include a specimen of synthetic origin. Biological samples may be animal, including human, fluid, solid (e.g., stool) or tissue, as well as liquid and solid food and feed products and ingredients such as dairy items, vegetables, meat and meat by-products, and waste. Biological samples may be obtained from all of the various families of domestic animals, as well as feral or wild animals, including, but not limited to, such animals as ungulates, bear, fish, lagomorphs, rodents, etc. Environmental samples include environmental material such as surface matter, soil, water and industrial samples, as well as samples obtained from food and dairy processing instruments, apparatus, equipment, utensils, disposable and non-disposable items. These examples are not to be construed as limiting the sample types. The term “source of target nucleic acid” refers to any sample that contains nucleic acids (RNA or DNA). Particularly preferred sources of target nucleic acids are biological samples including, but not limited to, blood, saliva, cerebrospinal fluid, pleural fluid, milk, lymph, sputum and semen.

As used herein, the term “sample template” refers to nucleic acid originating from a sample that is analyzed for the presence of “target” (defined below). In contrast, “background template” is used in reference to nucleic acid other than sample template that may or may not be present in a sample. Background template is often a contaminant. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample.

A “segment” or “region” is defined herein as an area of nucleic acid within a larger nucleic acid sequence. Regions are typically called out using nucleotide numbers of the larger sequence. An example of a region is nucleotides 697 to 1129 of GenBank Accession No.: X70659.1 (gi No.: 671812). Isolated oligonucleotides configured to hybridize within this region are useful for identification of fungi bioagents using the methods described herein. Other examples of regions can include, but are not limited to, exons, genes, VNTRs and conserved regions identified in an alignment of bioagents.

The “self-sustained sequence replication reaction” (3SR) (Guatelli et al., Proc. Natl. Acad. Sci., 87:1874-1878 [1990], with an erratum at Proc. Natl. Acad. Sci., 87:7797 [1990]) is a transcription-based in vitro amplification system (Kwok et al., Proc. Natl. Acad. Sci., 86:1173-1177 [1989]) that can exponentially amplify RNA sequences at a uniform temperature. The amplified RNA can then be utilized for mutation detection (Fahy et al., PCR Meth. Appl., 1:25-33 [1991]). In this method, an oligonucleotide primer is used to add a phage RNA polymerase promoter to the 5′ end of the sequence of interest. In a cocktail of enzymes and substrates that includes a second primer, reverse transcriptase, RNase H, RNA polymerase and ribo- and deoxyribonucleoside triphosphates, the target sequence undergoes repeated rounds of transcription, cDNA synthesis and second-strand synthesis to amplify the area of interest. The use of 3SR to detect mutations is kinetically limited to screening small segments of DNA (e.g., 200-300 base pairs).

As used herein, the term “sequence alignment” refers to a listing of multiple DNA or amino acid sequences that are aligned to highlight their similarities. The listings can be made using bioinformatics computer programs.

As used herein, the term “speciating primer pair” refers to a primer pair configured to produce a bioagent identifying amplicon with the diagnostic capability of identifying species members of a group of genera or a particular genus of bioagents.

As used herein, the term “species confirmation primer pair” refers to a primer pair configured to produce a bioagent identifying amplicon with the diagnostic capability to unambiguously produce a unique base composition to identify a particular species of bioagent.

As used herein, a “sub-species characteristic” is a genetic characteristic that provides the means to distinguish two members of the same bioagent species. For example, one fungal strain could be distinguished from another fungal strain of the same species by possessing a genetic change (e.g., a nucleotide deletion, addition or substitution) in one of the fungal genes, such as the large subunit ribosomal RNA.

As used herein, the term “target” refers to a nucleic acid sequence or structure to be detected or characterized. Thus, the “target” is sought to be sorted out from other nucleic acid sequences and contains a sequence that has at least partial complementarity with an oligonucleotide primer. The target nucleic acid may comprise single- or double-stranded DNA or RNA. A “segment” is defined as a region of nucleic acid within the target sequence.

The term “template” refers to a strand of nucleic acid on which a complementary copy is built from nucleoside triphosphates through the activity of a template-dependent nucleic acid polymerase. Within a duplex the template strand is, by convention, depicted and described as the “bottom” strand. Similarly, the non-template strand is often depicted and described as the “top” strand.

As used herein, the term “Tm” is used in reference to the “melting temperature.” The melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands. Several equations for calculating the Tm of nucleic acids are well known in the art. As indicated by standard references, a simple estimate of the Tm value may be calculated by the equation: Tm = 81.5 + 0.41(% G+C), when a nucleic acid is in aqueous solution at 1 M NaCl (see e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization (1985)). Other references (e.g., Allawi, H. T. & SantaLucia, J., Jr. Thermodynamics and NMR of internal G.T mismatches in DNA. Biochemistry 36, 10581-94 (1997)) include more sophisticated computations which take structural and environmental, as well as sequence, characteristics into account for the calculation of Tm.
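
The simple estimate given above can be computed directly from a sequence. The following is a minimal sketch, assuming an unmodified DNA sequence and the stated 1 M NaCl condition; the function name estimate_tm is hypothetical, and the result is only the rough approximation described in the text, not a nearest-neighbor thermodynamic calculation.

```python
def estimate_tm(sequence: str) -> float:
    """Rough melting temperature estimate: 81.5 + 0.41 * (% G+C).

    Applies the simple equation quoted in the text (1 M NaCl); the value
    is in degrees Celsius and ignores length, mismatches and modified bases.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc_count = seq.count("G") + seq.count("C")
    percent_gc = 100.0 * gc_count / len(seq)
    return 81.5 + 0.41 * percent_gc


# A hypothetical sequence with 50% G+C content.
print(round(estimate_tm("ACGTACGTAC"), 1))  # 102.0
```

Because the equation depends only on G+C percentage, two sequences of very different length but equal G+C content receive the same estimate, which is one reason the more detailed computations cited above are often preferred for short primers.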

The term “triangulation genotyping analysis” refers to a method of genotyping a bioagent by measurement of molecular masses or base compositions of amplification products, corresponding to bioagent identifying amplicons, obtained by amplification of regions of more than one gene. In this sense, the term “triangulation” refers to a method of establishing the accuracy of information by comparing three or more types of independent points of view bearing on the same findings. Triangulation genotyping analysis carried out with a plurality of triangulation genotyping analysis primers yields a plurality of base compositions that then provide a pattern or “barcode” from which a species type can be assigned. The species type may represent a previously known sub-species or strain, or may be a previously unknown strain having a specific and previously unobserved base composition barcode indicating the existence of a previously unknown genotype.
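
One way to picture such a “barcode” is as an ordered collection of base compositions, one per primer pair. The sketch below is illustrative only: the function names and primer pair labels are hypothetical, and the base compositions are counted from example sequences for simplicity, whereas in the methods described herein they are derived from measured molecular masses.

```python
from collections import Counter
from typing import Dict, Tuple

BaseComposition = Tuple[int, int, int, int]  # counts of A, G, C, T


def base_composition(amplicon: str) -> BaseComposition:
    """Count A, G, C and T residues in an amplicon sequence."""
    counts = Counter(amplicon.upper())
    return tuple(counts.get(base, 0) for base in "AGCT")


def barcode(amplicons: Dict[str, str]) -> Tuple[Tuple[str, BaseComposition], ...]:
    """Build a barcode: one base composition per primer pair, in sorted order."""
    return tuple((pair, base_composition(seq))
                 for pair, seq in sorted(amplicons.items()))


# Hypothetical amplicons from two primer pairs targeting different genes.
observed = {"primer_pair_2": "GGC", "primer_pair_1": "AT"}
print(barcode(observed))
# (('primer_pair_1', (1, 0, 0, 1)), ('primer_pair_2', (0, 2, 1, 0)))
```

Two isolates that yield the same tuple across all primer pairs would share a barcode under this representation, while a tuple absent from a reference collection would correspond to the previously unobserved barcode discussed above.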

As used herein, the term “triangulation genotyping analysis primer pair” is a primer pair configured to produce bioagent identifying amplicons for determining species types in a triangulation genotyping analysis.

The employment of more than one bioagent identifying amplicon for identification of a bioagent is herein referred to as “triangulation identification.” Triangulation identification is pursued by analyzing a plurality of bioagent identifying amplicons selected within multiple core genes. This process is used to reduce false negative and false positive signals, and enable reconstruction of the origin of hybrid or otherwise engineered bioagents. For example, identification of the three part toxin genes typical of B. anthracis (Bowen et al., J. Appl. Microbiol., 1999, 87, 270-278) in the absence of the expected signatures from the B. anthracis genome would suggest a genetic engineering event.

As used herein, the term “unknown bioagent” may mean either: (i) a bioagent whose existence is known (such as the well known bacterial species Staphylococcus aureus, for example) but which is not known to be in a sample to be analyzed, or (ii) a bioagent whose existence is not known (for example, the SARS coronavirus was unknown prior to April 2003). For example, if the method for identification of coronaviruses disclosed in commonly owned U.S. Patent Application Publication No.: US2005-0266397 (incorporated herein by reference in its entirety) was to be employed prior to April 2003 to identify the SARS coronavirus in a clinical sample, both meanings of “unknown” bioagent are applicable since the SARS coronavirus was unknown to science prior to April 2003 and since it was not known what bioagent (in this case a coronavirus) was present in the sample. On the other hand, if the method of U.S. Patent Application Publication No.: US2005-0266397 was to be employed subsequent to April 2003 to identify the SARS coronavirus in a clinical sample, only the first meaning (i) of “unknown” bioagent would apply since the SARS coronavirus became known to science subsequent to April 2003 and since it was not known what bioagent was present in the sample.

The term “variable sequence” as used herein refers to differences in nucleic acid sequence between two nucleic acids. For example, the genes of two different bacterial species may vary in sequence by the presence of single base substitutions and/or deletions or insertions of one or more nucleotides. These two forms of the structural gene are said to vary in sequence from one another. As used herein, “viral nucleic acid” includes, but is not limited to, DNA, RNA, or DNA that has been obtained from viral RNA, such as, for example, by performing a reverse transcription reaction. Viral RNA can either be single-stranded (of positive or negative polarity) or double-stranded.

The term “virus” refers to obligate, ultramicroscopic parasites incapable of autonomous replication (i.e., replication requires the use of the host cell's machinery). Viruses can survive outside of a host cell but cannot replicate.

The term “wild-type” refers to a gene or a gene product that has the characteristics of that gene or gene product when isolated from a naturally occurring source. A wild-type gene is that which is most frequently observed in a population and is thus arbitrarily designated the “normal” or “wild-type” form of the gene. In contrast, the term “modified,” “mutant” or “polymorphic” refers to a gene or gene product that displays modifications in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type gene or gene product. It is noted that naturally-occurring mutants can be isolated; these are identified by the fact that they have altered characteristics when compared to the wild-type gene or gene product.

As used herein, a “wobble base” is a variation in a codon found at the third nucleotide position of a DNA triplet. Variations in conserved regions of sequence are often found at the third nucleotide position due to redundancy in the amino acid code.

DETAILED DESCRIPTION OF EMBODIMENTS

A. Bioagent Identifying Amplicons

Provided are methods for detection and identification of unknown bioagents using bioagent identifying amplicons. Primers are selected to hybridize to conserved sequence regions of nucleic acids derived from a bioagent, and which bracket variable sequence regions to yield a bioagent identifying amplicon, which can be amplified and which is amenable to molecular mass determination. The molecular mass then provides a means to uniquely identify the bioagent without a requirement for prior knowledge of the possible identity of the bioagent. The molecular mass or corresponding base composition signature of the amplification product is then matched against a database of molecular masses or base composition signatures. The molecular masses or base compositions in a database are indexed to a bioagent and a primer pair. The primer pair corresponds with the primer pair that is used experimentally in the identification methods. Therefore, the experimentally derived molecular mass, or the base composition calculated therefrom, is queried across the database's molecular masses or base compositions indexed to the primer pair until a match is identified. Once a match is identified, the fungi bioagent is identified as being that indexed to said database's molecular mass or base composition. The experimentally-determined molecular mass or base composition may be within experimental error of the molecular mass or base composition of a known bioagent identifying amplicon and still be classified as a match. Moreover, if an absolute match is not identified in the database, the fungi bioagent can still be identified using probability cloud, nearest neighbor and/or triangulation, as described herein and in U.S. Patent Application No.: US2006-0259249, which is commonly owned and incorporated herein by reference in entirety. The present method provides rapid throughput and does not require nucleic acid sequencing of the amplified target sequence for bioagent detection and identification.
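
The database query described above can be sketched, in a non-limiting way, as a lookup indexed by primer pair that accepts a measured mass falling within an experimental error window. Everything concrete in this sketch is an assumption made for illustration: the database layout, the labels, the placeholder masses (which are not measured values) and the 25 ppm tolerance do not come from the present disclosure, and the probability cloud and nearest neighbor approaches are not implemented.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical database: primer pair -> [(bioagent label, amplicon mass in Da)].
# The masses are placeholders chosen for illustration, not measured values.
DATABASE: Dict[str, List[Tuple[str, float]]] = {
    "primer_pair_1": [("bioagent_A", 30500.0), ("bioagent_B", 30815.2)],
    "primer_pair_2": [("bioagent_A", 41210.5)],
}


def match_mass(primer_pair: str, measured_mass: float,
               tolerance_ppm: float = 25.0,
               database: Dict[str, List[Tuple[str, float]]] = DATABASE
               ) -> Optional[str]:
    """Return the closest bioagent whose indexed mass lies within tolerance.

    Only entries indexed to the primer pair used experimentally are
    searched; None is returned when nothing falls inside the error window.
    """
    best: Optional[Tuple[str, float]] = None
    for label, reference_mass in database.get(primer_pair, []):
        error_ppm = abs(measured_mass - reference_mass) / reference_mass * 1e6
        if error_ppm <= tolerance_ppm and (best is None or error_ppm < best[1]):
            best = (label, error_ppm)
    return best[0] if best else None


print(match_mass("primer_pair_1", 30500.3))  # bioagent_A
print(match_mass("primer_pair_1", 30600.0))  # None
```

A result of None in this sketch corresponds to the situation in which no absolute match is found, at which point the other identification approaches named above would be considered.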

Despite enormous biological diversity, all forms of life on earth share sets of essential, common features in their genomes. Since genetic data provide the underlying basis for identification of bioagents by the current methods, it is necessary to select segments of nucleic acids which ideally provide enough variability to distinguish each individual bioagent and whose molecular mass is amenable to molecular mass determination.

In some embodiments, at least one fungal nucleic acid segment is amplified in the process of identifying the bioagent. Thus, the nucleic acid segments that can be amplified by the primers disclosed herein and that provide enough variability to distinguish each individual fungal bioagent and whose molecular masses are amenable to molecular mass determination are herein described as fungal bioagent identifying amplicons or fungal identifying amplicons.

In some embodiments, bioagent identifying amplicons comprise from about 45 to about 200 linked nucleosides, or any range therein. One of ordinary skill in the art will appreciate that this embodies compounds of 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, or 200 nucleobases in length.

It is the combination of the portions of the bioagent nucleic acid segment to which the primers hybridize (hybridization sites) and the variable region between the primer hybridization sites that comprises the bioagent identifying amplicon.

In some embodiments, bioagent identifying amplicons amenable to molecular mass determination which are produced by the primers described herein are either of a length, size or mass compatible with the particular mode of molecular mass determination or compatible with a means of providing a predictable fragmentation pattern in order to obtain predictable fragments of a length compatible with the particular mode of molecular mass determination. Such means of providing a predictable fragmentation pattern of an amplification product include, but are not limited to, cleavage with restriction enzymes or cleavage primers, for example. Thus, in some embodiments, bioagent identifying amplicons are larger than about 200 nucleobases and are amenable to molecular mass determination following restriction digestion or other fragmentation means such as chemical cleavage agents. Methods of using restriction enzymes and cleavage primers are well known to those with ordinary skill in the art. Additionally, mass tags or nucleotide analogs are used.

In some embodiments, amplification products corresponding to bioagent identifying amplicons are obtained using the polymerase chain reaction (PCR), which is a routine method to those with ordinary skill in the molecular biology arts. Other amplification methods may be used, such as ligase chain reaction (LCR), low-stringency single primer PCR, and multiple strand displacement amplification (MDA). These methods are also known to those with ordinary skill.

B. Oligonucleotide Primer Pairs and Forward and Reverse Primer Members

As stated above, the primer members are configured to bind to conserved sequence regions of a bioagent identifying amplicon that flank an intervening variable region and yield amplification products which provide variability sufficient to distinguish each individual bioagent, and which are amenable to molecular mass analysis. The molecular mass of a given amplification product provides a means of identifying the bioagent from which it was obtained, due to the variability of the variable region. Thus, configuration of the primers involves, amongst other things, selection of a variable region with sufficient variability to resolve the identity of a given bioagent. In some embodiments, bioagent identifying amplicons are specific to the identity of the bioagent.

In some embodiments, identification of bioagents is accomplished at different levels using primers suited to resolution of each individual level of identification. Broad range survey primers are configured with the objective of identifying a bioagent as a member of a particular division (e.g., an order, family, genus or other such grouping of bioagents above the species level of bioagents). Drill-down primers are configured with the objective of identifying a bioagent at the sub-species level (including strains, subtypes, variants and isolates) based on sub-species characteristics. In some cases, the molecular mass or base composition of a fungal bioagent identifying amplicon defined by a broad range survey primer pair does not provide enough resolution to unambiguously identify a fungal bioagent at or below the species level. These cases benefit from further analysis of one or more fungal bioagent identifying amplicons generated from at least one additional broad range survey primer pair or from at least one additional division-wide primer pair. The employment of more than one bioagent identifying amplicon for identification of a bioagent is herein referred to as triangulation identification.

In one embodiment, the method identifies a species of fungus or a sub-species of fungus.

A preferred embodiment provides isolated oligonucleotide primer pairs wherein each of the forward member and reverse member of the primer pair is independently 13-35 consecutive nucleobases in length and configured to hybridize with at least 70% complementarity to a region of GenBank Accession Number X70659. In a further embodiment, said region is from nucleobase 134 to nucleobase 269. In a further embodiment, said region is from nucleobase 697 to nucleobase 1132. In an alternative embodiment, said region is from nucleobase 697 to nucleobase 834. In an alternative embodiment, said region is from nucleobase 2472 to nucleobase 2624. An additional embodiment provides isolated oligonucleotide primer pairs wherein each of the forward member and reverse member of the primer pair is independently 13-35 consecutive nucleobases in length and configured to hybridize with at least 70% complementarity to a region of GenBank Accession Number X53497. Primer pairs are preferably configured to generate an amplification product that comprises a length from 45-200 consecutive nucleobases.
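The nucleobase ranges recited above can be compared with the preferred 45-200 nucleobase product length by simple arithmetic. The Python sketch below assumes the coordinates are inclusive (an assumption, since the text does not state the convention) and treats each region as if a primer pair spanned it end to end; the function names are illustrative only.

```python
# Regions of GenBank Accession Number X70659 recited in the text.
REGIONS = [(134, 269), (697, 1132), (697, 834), (2472, 2624)]


def region_length(start, end):
    """Length of a region, assuming inclusive start and end coordinates."""
    return end - start + 1


def within_preferred_length(start, end, shortest=45, longest=200):
    """True if the whole region fits the preferred amplicon length range."""
    return shortest <= region_length(start, end) <= longest


for start, end in REGIONS:
    print(start, end, region_length(start, end),
          within_preferred_length(start, end))
```

Under these assumptions the regions 134-269, 697-834 and 2472-2624 span 136, 138 and 153 nucleobases, while 697-1132 spans 436, so a primer pair placed within that longer region would presumably bracket a shorter sub-region to stay within the preferred length.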

A representative process flow diagram used for the primer selection and validation process is outlined in FIG. 1. For each group of organisms, candidate target sequences are identified (200) from which nucleotide alignments are created (210) and analyzed (220). Primers are then configured by selecting appropriate priming regions (230) to facilitate the selection of candidate primer pairs (240). The primer pairs are then subjected to in silico analysis by electronic PCR (ePCR) (300) wherein bioagent identifying amplicons are obtained from sequence databases such as GenBank or other sequence collections (310) and checked for specificity in silico (320). Bioagent identifying amplicons obtained from GenBank sequences (310) can also be analyzed by a probability model which predicts the capability of a given amplicon to identify unknown bioagents such that the base compositions of amplicons with favorable probability scores are then stored in a base composition database (325). Alternatively, base compositions of the bioagent identifying amplicons obtained from the primers and GenBank sequences can be directly entered into the base composition database (330). Candidate primer pairs (240) are validated by testing their ability to hybridize to target nucleic acid by an in vitro amplification by a method such as PCR analysis (400) of nucleic acid from a collection of organisms (410). Amplification products thus obtained are analyzed by gel electrophoresis or by mass spectrometry to confirm the sensitivity, specificity and reproducibility of the primers used to obtain the amplification products (420).

Many of the important pathogens, including the organisms of greatest concern as biowarfare agents, have been completely sequenced. This effort has greatly facilitated the design of primers for the detection of unknown bioagents. The combination of broad-range priming with division-wide and drill-down priming has been used very successfully in several applications of the technology, including environmental surveillance for biowarfare threat agents and clinical sample analysis for medically important pathogens.

Synthesis of primers is well known and routine in the art. The primers may be conveniently and routinely made through the well-known technique of solid phase synthesis. Equipment for such synthesis is sold by several vendors including, for example, Applied Biosystems (Foster City, Calif.). Any other means for such synthesis known in the art may additionally or alternatively be employed.

In some embodiments, primers are employed as compositions for use in methods for identification of fungal bioagents as follows: a primer pair composition is contacted with nucleic acid of an unknown fungal bioagent. The nucleic acid is then amplified by a nucleic acid amplification technique, such as PCR for example, to obtain an amplification product that represents a bioagent identifying amplicon. The molecular mass of each strand of the double-stranded amplification product is determined by a molecular mass measurement technique such as mass spectrometry for example, wherein the two strands of the double-stranded amplification product are separated during the ionization process. In some embodiments, the mass spectrometry is electrospray Fourier transform ion cyclotron resonance mass spectrometry (ESI-FTICR-MS) or electrospray time of flight mass spectrometry (ESI-TOF-MS). A list of possible base compositions can be generated for the molecular mass value obtained for each strand and the choice of the correct base composition from the list is facilitated by matching the base composition of one strand with a complementary base composition of the other strand. The molecular mass or base composition thus determined is then compared with a database of molecular masses or base compositions of analogous bioagent identifying amplicons for known fungi. A match between the molecular mass or base composition of the amplification product and the molecular mass or base composition of an analogous bioagent identifying amplicon for a known fungal bioagent indicates the identity of the unknown bioagent. In some embodiments, the primer pair used is one of the primer pairs of Table 2. In some embodiments, the method is repeated using a different primer pair to resolve possible ambiguities in the identification process or to improve the confidence level for the identification assignment.

In some embodiments, a bioagent identifying amplicon may be produced using only a single primer (either the forward or reverse primer of any given primer pair), provided an appropriate amplification method is chosen, such as, for example, low stringency single primer PCR (LSSP-PCR). Adaptation of this amplification method in order to produce bioagent identifying amplicons can be accomplished by one with ordinary skill in the art without undue experimentation.

In some embodiments, the oligonucleotide primers are broad range survey primers which hybridize to conserved regions of nucleic acid encoding the 23S rRNA gene, 25S rRNA gene or the 18S rRNA gene of all (or between 80% and 100%, between 85% and 100%, between 90% and 100% or between 95% and 100% of) known fungi and produce bioagent identifying amplicons.

In other embodiments, the oligonucleotide primers are division-wide primers which hybridize to nucleic acid encoding genes of species within a genus of fungi. In other embodiments, the oligonucleotide primers are drill-down primers which enable the identification of sub-species characteristics. Drill-down primers provide the functionality of producing bioagent identifying amplicons for drill-down analyses such as strain typing when contacted with nucleic acid under amplification conditions. Identification of such sub-species characteristics is often critical for determining proper clinical treatment of fungal infections. In some embodiments, sub-species characteristics are identified using only broad range survey primers and division-wide and drill-down primers are not used.

In some embodiments, the primers used for amplification hybridize to and amplify genomic DNA, DNA of bacterial plasmids, DNA of DNA viruses or DNA reverse transcribed from RNA of an RNA virus.

In some embodiments, the primers used for amplification hybridize directly to viral RNA and act as reverse transcription primers for obtaining DNA from direct amplification of viral RNA. Methods of amplifying RNA to produce cDNA using reverse transcriptase are well known to those with ordinary skill in the art and can be routinely established without undue experimentation.

In some embodiments, various computer software programs may be used to aid in design of primers for amplification reactions such as Primer Premier 5 (Premier Biosoft, Palo Alto, Calif.) or OLIGO Primer Analysis Software (Molecular Biology Insights, Cascade, Colo.). These programs allow the user to input desired hybridization conditions such as melting temperature of a primer-template duplex for example. In some embodiments, an in silico PCR search algorithm, such as ePCR, is used to analyze primer specificity across a plurality of template sequences which can be readily obtained from public sequence databases such as GenBank for example. An existing RNA structure search algorithm (Macke et al., Nucl. Acids Res., 2001, 29, 4724-4735, which is incorporated herein by reference in its entirety) has been modified to include PCR parameters such as hybridization conditions, mismatches, and thermodynamic calculations (SantaLucia, Proc. Natl. Acad. Sci. U.S.A., 1998, 95, 1460-1465, which is incorporated herein by reference in its entirety). This also provides information on primer specificity of the selected primer pairs. In some embodiments, the hybridization conditions applied to the algorithm can limit the results of primer specificity obtained from the algorithm. In some embodiments, the melting temperature threshold for the primer-template duplex is specified to be 35 °C or a higher temperature. In some embodiments the number of acceptable mismatches is specified to be seven mismatches or less. In some embodiments, the buffer components and concentrations and primer concentrations may be specified and incorporated into the algorithm, for example, an appropriate primer concentration is about 250 nM and appropriate buffer components are 50 mM sodium or potassium and 1.5 mM Mg2+.
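The idea of an in silico PCR search with a mismatch limit can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the ePCR program or the modified algorithm cited above: it scans one strand of a template for the forward primer and for the reverse complement of the reverse primer, counts mismatches by direct comparison, and ignores melting temperature, buffer conditions and thermodynamics. All sequences and function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_sites(template, primer, max_mismatches):
    """Start positions where primer matches template with few enough mismatches."""
    n = len(primer)
    return [
        i for i in range(len(template) - n + 1)
        if sum(a != b for a, b in zip(template[i:i + n], primer)) <= max_mismatches
    ]


def in_silico_pcr(template, forward, reverse, max_mismatches=0,
                  min_len=45, max_len=200):
    """Predicted amplicons (primer sites included) read on the template strand."""
    reverse_site = reverse_complement(reverse)
    amplicons = []
    for start in find_sites(template, forward, max_mismatches):
        for rstart in find_sites(template, reverse_site, max_mismatches):
            end = rstart + len(reverse_site)
            # Require the reverse site downstream of the forward site.
            if rstart >= start + len(forward) and min_len <= end - start <= max_len:
                amplicons.append(template[start:end])
    return amplicons


# Synthetic example: two 14-mer priming sites flanking a 30-base variable region.
FORWARD = "ACGTACGTTGCAAC"
VARIABLE = "G" * 30
REVERSE_SITE = "TTGGCCAATCGATC"
REVERSE = reverse_complement(REVERSE_SITE)
TEMPLATE = "CC" + FORWARD + VARIABLE + REVERSE_SITE + "AA"

print(in_silico_pcr(TEMPLATE, FORWARD, REVERSE))
```

Raising `max_mismatches` widens the set of templates predicted to amplify, which is why the hybridization conditions supplied to such a search limit the specificity results it reports.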

One with ordinary skill in the art will recognize that a given primer need not hybridize with 100% complementarity in order to effectively prime the synthesis of a complementary nucleic acid strand in an amplification reaction. Moreover, a primer may hybridize over one or more segments such that intervening or adjacent segments are not involved in the hybridization event (e.g., a loop structure or a hairpin structure). Primer members are configured to have 70% or greater complementarity to at least two fungal bioagent nucleic acids in an alignment. Additionally, the primers can be configured to comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or at least 99% sequence identity with any of the primers listed in Table 2. Thus, in some embodiments, an extent of variation of 70% to 100%, or any range therewithin, of the sequence identity is possible relative to the specific primer sequences disclosed herein. Determination of sequence identity is described in the following example: a primer 20 nucleobases in length which is identical to another 20 nucleobase primer except for two non-identical residues has 18 of 20 identical residues (18/20=0.9 or 90% sequence identity). In another example, a primer 15 nucleobases in length having all residues identical to a 15 nucleobase segment of a primer 20 nucleobases in length would have 15/20=0.75 or 75% sequence identity with the 20 nucleobase primer.
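The two worked examples above (18/20 and 15/20) can be reproduced with a short Python sketch. It assumes an ungapped comparison in which the primer is slid along the reference primer and the best count of identical residues is divided by the length of the reference primer, as in the arithmetic of the text. The sequences are arbitrary placeholders and the function name is illustrative.

```python
def percent_identity(primer, reference):
    """Best ungapped count of identical residues, as a percentage of the
    reference primer length."""
    best = 0
    for offset in range(-(len(primer) - 1), len(reference)):
        matches = sum(
            1
            for i, base in enumerate(primer)
            if 0 <= offset + i < len(reference) and reference[offset + i] == base
        )
        best = max(best, matches)
    return 100.0 * best / len(reference)


REFERENCE = "ATGCCGTAAGCTTGACCTGA"   # placeholder 20-mer
TWO_CHANGES = "ATGTCGTAAGATTGACCTGA"  # same 20-mer with two substituted residues
SEGMENT = REFERENCE[3:18]             # 15-mer identical to part of the 20-mer

print(percent_identity(TWO_CHANGES, REFERENCE))  # 18 of 20 residues identical
print(percent_identity(SEGMENT, REFERENCE))      # 15 of 20 residues identical
```

Dividing by the reference length rather than the shorter primer length is a design choice taken from the second example in the text; programs such as Gap may define the denominator differently.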

Percent homology, sequence identity or complementarity, can be determined by, for example, the Gap program (Wisconsin Sequence Analysis Package, Version 8 for UNIX, Genetics Computer Group, University Research Park, Madison Wis.), using default settings, which uses the algorithm of Smith and Waterman (Adv. Appl. Math., 1981, 2, 482-489). In some embodiments, complementarity of primers with respect to the conserved priming regions of viral nucleic acid is between about 70% and about 75%. In other embodiments, homology, sequence identity or complementarity, is between about 75% and about 80%. In yet other embodiments, homology, sequence identity or complementarity, is at least 85%, at least 90%, at least 92%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or is 100%.

In some embodiments, the primers described herein comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 94%, at least 95%, at least 96%, at least 98%, or at least 99%, or 100% (or any range therewithin) sequence identity with the primer sequences specifically disclosed herein.

One with ordinary skill is able to calculate percent sequence identity or percent sequence homology and able to determine, without undue experimentation, the effects of variation of primer sequence identity on the function of the primer in its role in priming synthesis of a complementary strand of nucleic acid for production of an amplification product of a corresponding bioagent identifying amplicon.

In one embodiment, the primers are at least 13 nucleobases in length. In another embodiment, the primers are less than 36 nucleobases in length.

In some embodiments the primer members of the oligonucleotide primer pair are 13 to 35 nucleobases in length (13 to 35 linked nucleotide residues). These embodiments comprise oligonucleotide primers 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 or 35 nucleobases in length, or any range therewithin. The present invention contemplates using both longer and shorter primers. Furthermore, the primers may also be linked to one or more other desired moieties, including, but not limited to, affinity groups, ligands, regions of nucleic acid that are not complementary to the nucleic acid to be amplified, labels, etc. Primers may also form hairpin structures. For example, hairpin primers may be used to amplify short target nucleic acid molecules. The presence of the hairpin may stabilize the amplification complex (see e.g., TAQMAN MicroRNA Assays, Applied Biosystems, Foster City, Calif.).

In some embodiments, any oligonucleotide primer pair may have one or both primers with less than 70% sequence homology with a corresponding member of any of the primer pairs of Table 2 if the primer pair has the capability of producing an amplification product corresponding to a bioagent identifying amplicon. In other embodiments, any oligonucleotide primer pair may have one or both primers with a length greater than 35 nucleobases if the primer pair has the capability of producing an amplification product corresponding to a bioagent identifying amplicon.

In some embodiments, the function of a given primer may be substituted by a combination of two or more primer segments that hybridize adjacent to each other or that are linked by a nucleic acid loop structure or linker which allows a polymerase to extend the two or more primers in an amplification reaction.

In some embodiments, the isolated oligonucleotide primer pairs used for obtaining bioagent identifying amplicons are listed in Table 2. In other embodiments, other primer pairs are possible by combining certain members of the forward primers with certain members of the reverse primers. An example can be seen in Table 2 for three primer pair combinations of forward primer 25S_X70659_134_159_F (SEQ ID NO: 8), with the reverse primers 25S_X70659_247_269_F (SEQ ID NO: 22), or 25S_x70659_235_258_F (SEQ ID NO: 23). Arriving at a favorable alternate combination of primers in a primer pair depends upon the properties of the primer pair, most notably the size of the bioagent identifying amplicon that would be produced by the primer pair, which should be between about 45 and about 200 nucleobases in length. Alternatively, a bioagent identifying amplicon longer than about 200 nucleobases in length could be cleaved into smaller segments by cleavage reagents such as chemical reagents, or restriction enzymes, for example.

In some embodiments, the primers are configured to amplify nucleic acid of a bioagent to produce amplification products that can be measured by mass spectrometry and from whose molecular masses candidate base compositions can be readily calculated.

In some embodiments, any given primer comprises a modification comprising the addition of a non-templated T residue to the 5′ end of the primer (i.e., the added T residue does not necessarily hybridize to the nucleic acid being amplified). The addition of a non-templated T residue has an effect of minimizing the addition of non-templated adenosine residues as a result of the non-specific enzyme activity of Taq polymerase (Magnuson et al., Biotechniques, 1996, 21, 700-709), an occurrence which may lead to ambiguous results arising from molecular mass analysis.

In some embodiments, primers may contain one or more universal bases. Because any variation (due to codon wobble in the 3rd position) in the conserved regions among species is likely to occur in the third position of a DNA (or RNA) triplet, oligonucleotide primers can be configured such that the nucleotide corresponding to this position is a base which can bind to more than one nucleotide, referred to herein as a “universal nucleobase.” For example, under this “wobble” pairing, inosine (I) binds to U, C or A; guanine (G) binds to U or C, and uridine (U) binds to U or C. Other examples of universal nucleobases include nitroindoles such as 5-nitroindole or 3-nitropyrrole (Loakes et al., Nucleosides and Nucleotides, 1995, 14, 1001-1003), the degenerate nucleotides dP or dK (Hill et al.), an acyclic nucleoside analog containing 5-nitroindazole (Van Aerschot et al., Nucleosides and Nucleotides, 1995, 14, 1053-1056) or the purine analog 1-(2-deoxy-β-D-ribofuranosyl)-imidazole-4-carboxamide (Sala et al., Nucl. Acids Res., 1996, 24, 3302-3306).

In some embodiments, to compensate for the somewhat weaker binding by the wobble base, the oligonucleotide primers are configured such that the first and second positions of each triplet are occupied by nucleotide analogs that bind with greater affinity than the unmodified nucleotide. Examples of these analogs include, but are not limited to, 2,6-diaminopurine which binds to thymine, 5-propynyluracil which binds to adenine and 5-propynylcytosine and phenoxazines, including G-clamp, which binds to G. Propynylated pyrimidines are described in U.S. Pat. Nos. 5,645,985, 5,830,653 and 5,484,908, each of which is commonly owned and incorporated herein by reference in its entirety. Propynylated primers are described in U.S. Pre-Grant Publication No. 2003-0170682, which is also commonly owned and incorporated herein by reference in its entirety. Phenoxazines are described in U.S. Pat. Nos. 5,502,177, 5,763,588, and 6,005,096, each of which is incorporated herein by reference in its entirety. G-clamps are described in U.S. Pat. Nos. 6,007,992 and 6,028,183, each of which is incorporated herein by reference in its entirety.

In some embodiments, non-template primer tags are used to increase the melting temperature (Tm) of a primer-template duplex in order to improve amplification efficiency. A non-template tag is at least three consecutive A or T nucleotide residues on a primer which are not complementary to the template. In any given non-template tag, A can be replaced by C or G and T can also be replaced by C or G. Although Watson-Crick hybridization is not expected to occur for a non-template tag relative to the template, the extra hydrogen bond in a G-C pair relative to an A-T pair confers increased stability of the primer-template duplex and improves amplification efficiency for subsequent cycles of amplification when the primers hybridize to strands synthesized in previous cycles.

In other embodiments, propynylated tags may be used in a manner similar to that of the non-template tag, wherein two or more 5-propynyl-2-deoxycytidine or 5-propynyl-2-deoxythymidine (equivalent to 5-propynyl-2-deoxyuridine) residues replace template matching residues on a primer. In other embodiments, a primer contains a modified internucleoside linkage such as a phosphorothioate linkage, for example.

In some embodiments, the primers contain mass-modifying tags. Reducing the total number of possible base compositions of a nucleic acid of specific molecular weight provides a means of avoiding a persistent source of ambiguity in determination of base composition of amplification products. Addition of mass-modifying tags to certain nucleobases of a given primer will result in simplification of de novo determination of base composition of a given bioagent identifying amplicon from its molecular mass.

In some embodiments, the mass-modified nucleobase comprises one or more of the following: for example, 7-deaza-2′-deoxyadenosine-5′-triphosphate, 5-iodo-2′-deoxyuridine-5′-triphosphate, 5-bromo-2′-deoxyuridine-5′-triphosphate, 5-bromo-2′-deoxycytidine-5′-triphosphate, 5-iodo-2′-deoxycytidine-5′-triphosphate, 5-hydroxy-2′-deoxyuridine-5′-triphosphate, 4-thiothymidine-5′-triphosphate, 5-aza-2′-deoxyuridine-5′-triphosphate, 5-fluoro-2′-deoxyuridine-5′-triphosphate, O6-methyl-2′-deoxyguanosine-5′-triphosphate, N2-methyl-2′-deoxyguanosine-5′-triphosphate, 8-oxo-2′-deoxyguanosine-5′-triphosphate or thiothymidine-5′-triphosphate. In some embodiments, the mass-modified nucleobase comprises 15N or 13C or both 15N and 13C.

In some embodiments, multiplex amplification is performed where multiple bioagent identifying amplicons are amplified with a plurality of primer pairs. The advantages of multiplexing are that fewer reaction containers (for example, wells of a 96- or 384-well plate) are needed for each molecular mass measurement, providing time, resource and cost savings because additional bioagent identification data can be obtained within a single analysis. Multiplex amplification methods are well known to those with ordinary skill and can be developed without undue experimentation. However, in some embodiments, one useful and non-obvious step in selecting a plurality of candidate bioagent identifying amplicons for multiplex amplification is to ensure that each strand of each amplification product will be sufficiently different in molecular mass that mass spectral signals will not overlap and lead to ambiguous analysis results. In some embodiments, a 10 Da difference in mass of two strands of one or more amplification products is sufficient to avoid overlap of mass spectral peaks.
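The mass-separation check described above lends itself to a small Python sketch. It assumes the expected strand masses of every candidate amplicon are already known and flags any two strands that differ by less than the 10 Da figure given in the text. The strand labels and the masses of the second amplicon are hypothetical.

```python
from itertools import combinations


def overlapping_strands(strand_masses, min_separation_da=10.0):
    """Return pairs of strand labels whose masses differ by less than
    min_separation_da and could therefore give overlapping peaks."""
    ordered = sorted(strand_masses.items())
    return [
        (label_a, label_b)
        for (label_a, mass_a), (label_b, mass_b) in combinations(ordered, 2)
        if abs(mass_a - mass_b) < min_separation_da
    ]


# Hypothetical strand masses (Da) for two candidate amplicons in one reaction.
CANDIDATES = {
    "amplicon1_fwd": 14208.4,
    "amplicon1_rev": 14079.3,
    "amplicon2_fwd": 14212.0,
    "amplicon2_rev": 15000.0,
}

print(overlapping_strands(CANDIDATES))
```

An empty result suggests the candidate set is compatible with a single multiplex reaction under this criterion; any flagged pair would call for swapping one amplicon for another candidate.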

In some embodiments, as an alternative to multiplex amplification, single amplification reactions can be pooled before analysis by mass spectrometry. In these embodiments, as for multiplex amplification embodiments, it is useful to select a plurality of candidate bioagent identifying amplicons to ensure that each strand of each amplification product will be sufficiently different in molecular mass that mass spectral signals will not overlap and lead to ambiguous analysis results.

C. Determination of Molecular Mass of Bioagent Identifying Amplicons

In some embodiments, the molecular mass of a given bioagent identifying amplicon is determined by mass spectrometry. Mass spectrometry has several advantages, not the least of which is high bandwidth characterized by the ability to separate (and isolate) many molecular peaks across a broad range of mass to charge ratio (m/z). Thus mass spectrometry is intrinsically a parallel detection scheme without the need for probes, since every amplification product is identified by its molecular mass. The current state of the art in mass spectrometry is such that less than femtomole quantities of material can be readily analyzed to afford information about the molecular contents of the sample. An accurate assessment of the molecular mass of the material can be quickly obtained, irrespective of whether the molecular weight of the sample is several hundred, or in excess of one hundred thousand atomic mass units (amu) or Daltons.

In some embodiments, intact molecular ions are generated from amplification products using one of a variety of ionization techniques to convert the sample to gas phase. These ionization methods include, but are not limited to, electrospray ionization (ES), matrix-assisted laser desorption ionization (MALDI) and fast atom bombardment (FAB). Upon ionization, several peaks are observed from one sample due to the formation of ions with different charges. Averaging the multiple readings of molecular mass obtained from a single mass spectrum affords an estimate of molecular mass of the bioagent identifying amplicon. Electrospray ionization mass spectrometry (ESI-MS) is particularly useful for very high molecular weight polymers such as proteins and nucleic acids having molecular weights greater than 10 kDa, since it yields a distribution of multiply-charged molecules of the sample without causing a significant amount of fragmentation.
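The averaging of readings from several charge states can be sketched as follows in Python. The sketch assumes negative-ion electrospray, in which a strand carrying z negative charges has lost z protons, so each peak at a given m/z corresponds to a neutral mass of z × (m/z + proton mass). The polarity, the charge states and the function names are assumptions for illustration; the text does not specify them.

```python
PROTON_MASS = 1.007276  # Da


def expected_mz(neutral_mass, charge):
    """m/z of an [M - zH]z- ion (negative-ion mode assumed)."""
    return neutral_mass / charge - PROTON_MASS


def neutral_mass_from_peak(mz, charge):
    """Neutral mass implied by one peak of known charge state."""
    return charge * (mz + PROTON_MASS)


def estimate_mass(peaks):
    """Average the neutral masses implied by (m/z, charge) peaks."""
    masses = [neutral_mass_from_peak(mz, charge) for mz, charge in peaks]
    return sum(masses) / len(masses)


# Simulated peaks for one strand observed at three charge states.
TRUE_MASS = 14208.373
PEAKS = [(expected_mz(TRUE_MASS, z), z) for z in (10, 11, 12)]

print(round(estimate_mass(PEAKS), 3))
```

With real spectra each peak carries its own measurement error, so averaging across charge states tends to give a more stable estimate than any single peak.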

The mass detectors used can include, but are not limited to, Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS), time of flight (TOF), ion trap, quadrupole, magnetic sector, Q-TOF, and triple quadrupole.

D. Base Compositions of Bioagent Identifying Amplicons

Although the molecular mass of amplification products obtained using intelligent primers provides a means for identification of bioagents, conversion of molecular mass data to a base composition signature is useful for certain analyses. Base composition is defined above as being the number of individual residues in an amplicon (natural and analog). Base composition is independent of the linear arrangement of said individual residues. In some embodiments, a base composition provides an index of a specific organism. Base compositions can be calculated from known sequences of known bioagent identifying amplicons and can be experimentally determined by measuring the molecular mass of a given bioagent identifying amplicon, followed by determination of all possible base compositions which are consistent with the measured molecular mass within acceptable experimental error. The following example illustrates determination of base composition from an experimentally obtained molecular mass of a 46-mer amplification product originating at position 1337 of the 16S rRNA of Bacillus anthracis. The forward and reverse strands of the amplification product have measured molecular masses of 14208 and 14079 Da, respectively. The possible base compositions derived from the molecular masses of the forward and reverse strands for the B. anthracis products are listed in Table 1.

TABLE 1
Possible Base Compositions for B. anthracis 46mer Amplification Product

Calc. Mass       Mass Error       Base Composition     Calc. Mass       Mass Error       Base Composition
Forward Strand   Forward Strand   of Forward Strand    Reverse Strand   Reverse Strand   of Reverse Strand
14208.2935       0.079520         A1 G17 C10 T18       14079.2624       0.080600         A0 G14 C13 T19
14208.3160       0.056980         A1 G20 C15 T10       14079.2849       0.058060         A0 G17 C18 T11
14208.3386       0.034440         A1 G23 C20 T2        14079.3075       0.035520         A0 G20 C23 T3
14208.3074       0.065560         A6 G11 C3 T26        14079.2538       0.089180         A5 G5 C1 T35
14208.3300       0.043020         A6 G14 C8 T18        14079.2764       0.066640         A5 G8 C6 T27
14208.3525       0.020480         A6 G17 C13 T10       14079.2989       0.044100         A5 G11 C11 T19
14208.3751       0.002060         A6 G20 C18 T2        14079.3214       0.021560         A5 G14 C16 T11
14208.3439       0.029060         A11 G8 C1 T26        14079.3440       0.000980         A5 G17 C21 T3
14208.3665       0.006520         A11 G11 C6 T18       14079.3129       0.030140         A10 G5 C4 T27
14208.3890       0.016020         A11 G14 C11 T10      14079.3354       0.007600         A10 G8 C9 T19
14208.4116       0.038560         A11 G17 C16 T2       14079.3579       0.014940         A10 G11 C14 T11
14208.4030       0.029980         A16 G8 C4 T18        14079.3805       0.037480         A10 G14 C19 T3
14208.4255       0.052520         A16 G11 C9 T10       14079.3494       0.006360         A15 G2 C2 T27
14208.4481       0.075060         A16 G14 C14 T2       14079.3719       0.028900         A15 G5 C7 T19
14208.4395       0.066480         A21 G5 C2 T18        14079.3944       0.051440         A15 G8 C12 T11
14208.4620       0.089020         A21 G8 C7 T10        14079.4170       0.073980         A15 G11 C17 T3
-                -                -                    14079.4084       0.065400         A20 G2 C5 T19
-                -                -                    14079.4309       0.087940         A20 G5 C10 T13
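Candidate lists such as those in Table 1 arise from asking which base compositions have a calculated mass within experimental error of the measured mass. A minimal Python sketch of that enumeration follows. The residue masses are monoisotopic values for 2'-deoxynucleotide residues, and the strand is assumed to carry hydroxyl groups at both ends (no terminal phosphate); these choices, the 0.1 Da tolerance, the illustrative measured value of 14208.37 Da (the text quotes 14208 Da) and the function names are all assumptions, so the calculated masses may differ slightly from those in the table.

```python
# Monoisotopic masses (Da) of 2'-deoxynucleotide residues within a chain.
RESIDUE_MASS = {"A": 313.0576, "G": 329.0525, "C": 289.0464, "T": 304.0460}
WATER = 18.0106   # added for the free chain ends
HPO3 = 79.9663    # removed because no terminal phosphate is assumed


def strand_mass(a, g, c, t):
    """Calculated mass of a linear strand with the given base counts."""
    residues = (a * RESIDUE_MASS["A"] + g * RESIDUE_MASS["G"]
                + c * RESIDUE_MASS["C"] + t * RESIDUE_MASS["T"])
    return residues + WATER - HPO3


def candidate_compositions(measured_mass, length, tolerance):
    """All (A, G, C, T) counts of the given total length whose calculated
    mass lies within +/- tolerance of the measured mass."""
    hits = []
    for a in range(length + 1):
        for g in range(length + 1 - a):
            for c in range(length + 1 - a - g):
                t = length - a - g - c
                if abs(strand_mass(a, g, c, t) - measured_mass) <= tolerance:
                    hits.append((a, g, c, t))
    return hits


candidates = candidate_compositions(14208.37, length=46, tolerance=0.1)
print(len(candidates), (11, 14, 11, 10) in candidates)
```

Fixing the strand length at 46 keeps the search small; without a length constraint many more compositions would fall inside the same mass window, which is the ambiguity that comparing the two strands is meant to resolve.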

Among the 16 possible base compositions for the forward strand and the 18 possible base compositions for the reverse strand that were calculated, only one pair (shown in bold) consists of complementary base compositions, which indicates the true base composition of the amplification product. It should be recognized that this logic is applicable for determination of base compositions of any bioagent identifying amplicon, regardless of the class of bioagent from which the corresponding amplification product was obtained.
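
The enumeration and complement-matching logic described above can be sketched as follows. This is a minimal illustration and not the software used to produce Table 1. It assumes the residue masses quoted elsewhere in this document (A=313.058, G=329.052, C=289.046, T=304.046), expressed in milli-daltons so that the arithmetic is exact. The function names, the tolerance, the terminal-group correction parameter `end_offset_mda`, and the 46-mer composition in the usage lines are all hypothetical.

```python
# Residue masses in milli-daltons (mDa), from the values quoted in this
# document; integers keep the comparison free of floating-point noise.
RESIDUE_MDA = {"A": 313058, "G": 329052, "C": 289046, "T": 304046}


def strand_mass_mda(comp, end_offset_mda=0):
    """Strand mass in mDa: sum of residue masses plus an assumed,
    caller-supplied correction for the terminal groups."""
    return sum(RESIDUE_MDA[b] * n for b, n in comp.items()) + end_offset_mda


def candidate_compositions(measured_mda, length, tol_mda, end_offset_mda=0):
    """List every base composition of the given length whose calculated
    mass lies within tol_mda of the measured mass."""
    hits = []
    for a in range(length + 1):
        for g in range(length + 1 - a):
            for c in range(length + 1 - a - g):
                comp = {"A": a, "G": g, "C": c, "T": length - a - g - c}
                error = abs(strand_mass_mda(comp, end_offset_mda) - measured_mda)
                if error <= tol_mda:
                    hits.append(comp)
    return hits


def is_complementary(fwd, rev):
    """Watson-Crick complementarity at the composition level:
    A pairs with T and G pairs with C."""
    return (fwd["A"] == rev["T"] and fwd["T"] == rev["A"]
            and fwd["G"] == rev["C"] and fwd["C"] == rev["G"])


def complementary_pairs(fwd_hits, rev_hits):
    """Keep only forward/reverse candidates that can belong to one duplex."""
    return [(f, r) for f in fwd_hits for r in rev_hits if is_complementary(f, r)]


if __name__ == "__main__":
    # Hypothetical 46-mer duplex, used only to exercise the functions.
    true_fwd = {"A": 12, "G": 11, "C": 10, "T": 13}
    true_rev = {"A": 13, "G": 10, "C": 11, "T": 12}
    fwd_hits = candidate_compositions(strand_mass_mda(true_fwd), 46, 100)
    rev_hits = candidate_compositions(strand_mass_mda(true_rev), 46, 100)
    print(len(fwd_hits), len(rev_hits), complementary_pairs(fwd_hits, rev_hits))
```

Each strand on its own typically yields several candidate compositions within the tolerance; requiring the forward and reverse candidates to be complementary is what narrows the list.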

In some embodiments, assignment of previously unobserved base compositions (also known as “true unknown base compositions”) to a given phylogeny can be accomplished via the use of pattern classifier model algorithms. Base compositions, like sequences, vary slightly from strain to strain within species, for example. In some embodiments, the pattern classifier model is the mutational probability model. In other embodiments, the pattern classifier is the polytope model. The mutational probability model and polytope model are both commonly owned and described in U.S. Publication No. US2006-0259249, which is incorporated herein by reference in its entirety.

In one embodiment, it is possible to manage this diversity by building “base composition probability clouds” around the composition constraints for each species. This permits identification of organisms in a fashion similar to sequence analysis. A “pseudo four-dimensional plot” can be used to visualize the concept of base composition probability clouds. Optimal primer design requires optimal choice of bioagent identifying amplicons and maximizes the separation between the base composition signatures of individual bioagents. Areas where clouds overlap indicate regions that may result in a misclassification, a problem which is overcome by a triangulation identification process using bioagent identifying amplicons not affected by overlap of base composition probability clouds.

In some embodiments, base composition probability clouds provide the means for screening potential primer pairs in order to avoid potential misclassifications of base compositions. In other embodiments, base composition probability clouds provide the means for predicting the identity of a bioagent whose assigned base composition was not previously observed and/or indexed in a bioagent identifying amplicon base composition database due to evolutionary transitions in its nucleic acid sequence. Thus, in contrast to probe-based techniques, mass spectrometry determination of base composition does not require prior knowledge of the composition or sequence in order to make the measurement.

Provided is bioagent classifying information similar to DNA sequencing and phylogenetic analysis at a level sufficient to identify a given bioagent. Furthermore, the process of determination of a previously unknown base composition for a given bioagent (for example, in a case where sequence information is unavailable) has downstream utility by providing additional bioagent indexing information with which to populate base composition databases. The process of future bioagent identification is thus greatly improved as more BCS indexes become available in base composition databases.

E. Triangulation Identification

In some cases, a molecular mass of a single bioagent identifying amplicon alone does not provide enough resolution to unambiguously identify a given bioagent. The employment of more than one bioagent identifying amplicon for identification of a bioagent is herein referred to as “triangulation identification.” Triangulation identification is pursued by determining the molecular masses of a plurality of bioagent identifying amplicons selected within a plurality of housekeeping genes. This process is used to reduce false negative and false positive signals, and enable reconstruction of the origin of hybrid or otherwise engineered bioagents. For example, identification of the three-part toxin genes typical of B. anthracis (Bowen et al., J. Appl. Microbiol., 1999, 87, 270-278) in the absence of the expected signatures from the B. anthracis genome would suggest a genetic engineering event.

In some embodiments, the triangulation identification process can be pursued by characterization of bioagent identifying amplicons in a massively parallel fashion using the polymerase chain reaction (PCR), such as multiplex PCR where multiple primers are employed in the same amplification reaction mixture, or PCR in multi-well plate format wherein a different and unique pair of primers is used in multiple wells containing otherwise identical reaction mixtures. Such multiplex and multi-well PCR methods are well known to those with ordinary skill in the arts of rapid throughput amplification of nucleic acids. In other related embodiments, one PCR reaction per well or container may be carried out, followed by an amplicon pooling step wherein the amplification products of different wells are combined in a single well or container which is then subjected to molecular mass analysis. The combination of pooled amplicons can be chosen such that the expected ranges of molecular masses of individual amplicons are not overlapping and thus will not complicate identification of signals.
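
Two pieces of the triangulation logic lend themselves to a short sketch: narrowing an identification by intersecting the candidates reported for each amplicon, and checking that amplicons chosen for pooling have non-overlapping expected mass ranges. The sketch below is illustrative only; the function names, the primer pair labels and the organism labels are hypothetical, and no particular matching software is implied.

```python
def triangulate(candidates_per_amplicon):
    """Intersect the candidate bioagents reported for each amplicon.

    candidates_per_amplicon maps a primer pair label to the set of
    bioagents whose expected signature matches the measurement.
    """
    sets = [set(c) for c in candidates_per_amplicon.values()]
    return set.intersection(*sets) if sets else set()


def ranges_overlap(mass_ranges):
    """Return True if any two (low, high) expected mass ranges overlap.

    After sorting by lower bound it is enough to compare neighbours.
    """
    ordered = sorted(mass_ranges)
    return any(prev[1] >= nxt[0] for prev, nxt in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    # Hypothetical matches for two primer pairs.
    matches = {
        "primer pair 1": {"species X", "species Y"},
        "primer pair 2": {"species Y", "species Z"},
    }
    print(triangulate(matches))
    # Hypothetical expected mass ranges (Da) of amplicons to be pooled.
    print(ranges_overlap([(14000, 15000), (18000, 19500), (24000, 26000)]))
```

A pooling scheme would be accepted only when `ranges_overlap` returns False for the amplicons combined in one well.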

F. Codon Base Composition Analysis

In some embodiments, one or more nucleotide substitutions within a codon of a gene of an infectious organism confer drug resistance upon the organism, and such substitutions can be determined by codon base composition analysis. The organism can be a bacterium, virus, fungus or protozoan.

In one embodiment, the amplification product containing the codon being analyzed is of a length of about 35 to about 200 nucleobases. The primers employed in obtaining the amplification product can hybridize to upstream and downstream sequences directly adjacent to the codon, or can hybridize to upstream and downstream sequences one or more sequence positions away from the codon. The primers may have at least 70% sequence complementarity with the sequence of the gene containing the codon being analyzed.

In some embodiments, the codon analysis is undertaken for the purpose of investigating genetic disease in an individual. In other embodiments, the codon analysis is undertaken for the purpose of investigating a drug resistance mutation or any other deleterious mutation in an infectious organism such as a bacterium, virus, fungus or protozoan.

In some embodiments, the molecular mass of an amplification product containing the codon being analyzed is measured by mass spectrometry. The mass spectrometry can be either electrospray (ESI) mass spectrometry or matrix-assisted laser desorption ionization (MALDI) mass spectrometry. Time-of-flight (TOF) is an example of one mode of mass spectrometry compatible with the analysis methods.

The methods can also be employed to determine the relative abundance of drug resistant strains of the organism being analyzed. Relative abundances can be calculated from amplitudes of mass spectral signals with relation to internal calibrants. In some embodiments, known quantities of internal amplification calibrants can be included in the amplification reactions and abundances of analyte amplification product estimated in relation to the known quantities of the calibrants.

In some embodiments, upon identification of one or more drug-resistant strains of an infectious organism infecting an individual, one or more alternative treatments can be devised to treat the individual.

G. Determination of the Quantity of a Bioagent

In some embodiments, the identity and quantity of an unknown bioagent can be determined using the process illustrated in FIG. 2. Primers (500) and a known quantity of a calibration polynucleotide (505) are added to a sample containing nucleic acid of an unknown bioagent. The total nucleic acid in the sample is then subjected to an amplification reaction (510) to obtain amplification products. The molecular masses of the amplification products are determined (515), yielding molecular mass and abundance data. The molecular mass of the bioagent identifying amplicon (520) provides the means for its identification (525) and the molecular mass of the calibration amplicon obtained from the calibration polynucleotide (530) provides the means for its identification (535). The abundance data of the bioagent identifying amplicon is recorded (540) and the abundance data for the calibration amplicon is recorded (545), both of which are used in a calculation (550) which determines the quantity of unknown bioagent in the sample.

A sample comprising an unknown bioagent is contacted with a pair of primers that provide the means for amplification of nucleic acid from the bioagent, and a known quantity of a polynucleotide that comprises a calibration sequence. The nucleic acids of the bioagent and of the calibration sequence are amplified and the rate of amplification is reasonably assumed to be similar for the nucleic acid of the bioagent and of the calibration sequence. The amplification reaction then produces two amplification products: a bioagent identifying amplicon and a calibration amplicon. The bioagent identifying amplicon and the calibration amplicon should be distinguishable by molecular mass while being amplified at essentially the same rate. Effecting differential molecular masses can be accomplished by choosing, as a calibration sequence, a representative bioagent identifying amplicon (from a specific species of bioagent) and performing, for example, a 2-8 nucleobase deletion or insertion within the variable region between the two priming sites. The amplified sample containing the bioagent identifying amplicon and the calibration amplicon is then subjected to molecular mass analysis by mass spectrometry, for example. The resulting molecular mass analysis of the nucleic acid of the bioagent and of the calibration sequence provides molecular mass data and abundance data for the nucleic acid of the bioagent and of the calibration sequence. The molecular mass data obtained for the nucleic acid of the bioagent enables identification of the unknown bioagent and the abundance data enables calculation of the quantity of the bioagent, based on the knowledge of the quantity of calibration polynucleotide contacted with the sample.
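
The calculation step can be illustrated with a simple ratio. The sketch below assumes, as the passage does, that the bioagent nucleic acid and the calibration sequence amplify at essentially the same rate, so that the ratio of the two abundances approximates the ratio of starting quantities. The function name and the numbers in the usage lines are hypothetical, and the exact form of the calculation used in practice is not specified here.

```python
def estimate_bioagent_quantity(bioagent_abundance, calibrant_abundance,
                               calibrant_quantity):
    """Estimate the starting quantity of bioagent nucleic acid.

    Assumes equal amplification efficiency for the bioagent identifying
    amplicon and the calibration amplicon, so that
        bioagent_quantity / calibrant_quantity
            ~= bioagent_abundance / calibrant_abundance.
    """
    if calibrant_abundance <= 0:
        # No measurable calibration amplicon: the run cannot be quantified.
        raise ValueError("calibration amplicon not detected")
    return calibrant_quantity * bioagent_abundance / calibrant_abundance


if __name__ == "__main__":
    # Hypothetical peak abundances with 500 calibrant copies in the reaction.
    print(estimate_bioagent_quantity(300.0, 100.0, 500))
```

Raising an error when no calibration amplicon is measured reflects the use of the calibrant as an internal positive control: a missing calibrant signal points to a failed amplification or analysis step rather than to an absent bioagent.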

In some embodiments, construction of a standard curve where the amount of calibration polynucleotide spiked into the sample is varied provides additional resolution and improved confidence for the determination of the quantity of bioagent in the sample. The use of standard curves for analytical determination of molecular quantities is well known to one with ordinary skill and can be performed without undue experimentation.

In some embodiments, multiplex amplification is performed where multiple bioagent identifying amplicons are amplified with multiple primer pairs which also amplify the corresponding standard calibration sequences. In this or other embodiments, the standard calibration sequences are optionally included within a single vector which functions as the calibration polynucleotide. Multiplex amplification methods are well known to those with ordinary skill and can be performed without undue experimentation.

In some embodiments, the calibrant polynucleotide is used as an internal positive control to confirm that amplification conditions and subsequent analysis steps are successful in producing a measurable amplicon. Even in the absence of copies of the genome of a bioagent, the calibration polynucleotide should give rise to a calibration amplicon. Failure to produce a measurable calibration amplicon indicates a failure of amplification or of a subsequent analysis step such as amplicon purification or molecular mass determination. Reaching a conclusion that such failures have occurred is, in itself, a useful event.

In some embodiments, the calibration sequence is comprised of DNA. In some embodiments, the calibration sequence is comprised of RNA.

In some embodiments, the calibration sequence is inserted into a vector that itself functions as the calibration polynucleotide. In some embodiments, more than one calibration sequence is inserted into the vector that functions as the calibration polynucleotide. Such a calibration polynucleotide is herein termed a “combination calibration polynucleotide.” The process of inserting polynucleotides into vectors is routine to those skilled in the art and can be accomplished without undue experimentation. Thus, it should be recognized that the calibration method should not be limited to the embodiments described herein. The calibration method can be applied for determination of the quantity of any bioagent identifying amplicon when an appropriate standard calibrant polynucleotide sequence is configured and used. The process of choosing an appropriate vector for insertion of a calibrant is also a routine operation that can be accomplished by one with ordinary skill without undue experimentation.

H. Identification of Fungi

In certain embodiments, the primer pairs produce bioagent identifying amplicons within stable and conserved regions of fungi. Characterization of an amplicon generated by priming a conserved region is preferred because there is a low probability that the region will evolve past the point of primer recognition, in which case the amplification step would lose resolution of fungal bioagents and would eventually fail. Such a primer set is thus useful as a broad range survey-type primer. In another embodiment, the primers produce bioagent identifying amplicons in a region which evolves more quickly than the stable region described above. The advantage of characterizing a bioagent identifying amplicon corresponding to an evolving genomic region is that it is useful for distinguishing emerging strain variants. In this embodiment, the primer pairs are configured to encompass the rapidly evolving region such that the base composition signature for fungal bioagents can be plotted and traced through a phylogenetic tree.

Thus provided is a platform for identification of diseases caused by pathogenic fungi. The present invention eliminates the need for prior knowledge of bioagent sequence to generate hybridization probes because probes are not necessary. Thus, in another embodiment, there is provided a means of determining the etiology of a fungal infection when the process of identification of fungi is carried out in a clinical setting, even when the fungus is a new species never observed before. This is possible because, as described directly above, the methods are not confounded by naturally occurring evolutionary variations occurring in the sequence acting as the template for production of the bioagent identifying amplicon. Measurement of molecular mass and determination of base composition are accomplished in an unbiased manner without sequence prejudice.

Also provided is a means of tracking the spread of any species or strain of fungus when a plurality of samples obtained from different locations are analyzed by the methods described above in an epidemiological setting. In one embodiment, a plurality of samples from a plurality of different locations is analyzed with primer pairs which produce bioagent identifying amplicons, a subset of which contains a specific fungus. The corresponding locations of the members of the fungus-containing subset indicate the spread of the specific fungus to the corresponding locations.
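
As a rough illustration of the bookkeeping behind this kind of tracking, the sketch below groups sampling locations by the fungi identified in the samples taken there. The function name, the location labels and the per-sample results are hypothetical; in practice the per-sample identifications would come from the base composition analysis described above.

```python
from collections import defaultdict


def locations_by_fungus(sample_results):
    """Map each identified fungus to the sorted locations where it was found.

    sample_results is an iterable of (location, fungi) pairs, where fungi
    is the set of organisms identified in the sample from that location.
    """
    spread = defaultdict(set)
    for location, fungi in sample_results:
        for fungus in fungi:
            spread[fungus].add(location)
    return {fungus: sorted(places) for fungus, places in spread.items()}


if __name__ == "__main__":
    # Hypothetical survey of three locations.
    results = [
        ("Location 1", {"Aspergillus fumigatus"}),
        ("Location 2", set()),
        ("Location 3", {"Aspergillus fumigatus", "Candida albicans"}),
    ]
    print(locations_by_fungus(results))
```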

I. Kits

Also provided are kits for carrying out the methods described herein. In some embodiments, the kit may comprise a sufficient quantity of one or more primer pairs to perform an amplification reaction on a target polynucleotide from a bioagent to form a bioagent identifying amplicon. In some embodiments, the kit may comprise from one to fifty primer pairs, from one to twenty primer pairs, from one to ten primer pairs, or from two to five primer pairs. In some embodiments, the kit may comprise one or more primer pair(s) recited in Table 2.

In some embodiments, the kit comprises one or more broad range survey primer(s), division wide primer(s), or drill-down primer(s), or any combination thereof. If a given problem involves identification of a specific bioagent, the solution to the problem may require the selection of a particular combination of primers. A kit may be configured so as to comprise particular primer pairs for identification of a particular bioagent. A drill-down kit may be used, for example, to distinguish different sub-species types of fungi. In some embodiments, the primer pair components of any of these kits may be additionally combined to comprise additional combinations of broad range survey primers and division-wide primers so as to be able to identify the fungus.

In some embodiments, the kit contains standardized calibration polynucleotides for use as internal amplification calibrants. Internal calibrants are described in commonly owned International Application WO 2005/094421, which is incorporated herein by reference in its entirety.

In some embodiments, the kit comprises a sufficient quantity of reverse transcriptase (if an RNA virus is to be identified for example), a DNA polymerase, suitable nucleoside triphosphates (including alternative dNTPs such as inosine or modified dNTPs such as the 5-propynyl pyrimidines or any dNTP containing molecular mass-modifying tags such as those described above), a DNA ligase, and/or reaction buffer, or any combination thereof, for the amplification processes described above. A kit may further include instructions pertinent for the particular embodiment of the kit, such instructions describing the primer pairs and amplification conditions for operation of the method. A kit may also comprise amplification reaction containers such as microcentrifuge tubes and the like. A kit may also comprise reagents or other materials for isolating bioagent nucleic acid or bioagent identifying amplicons from amplification, including, for example, detergents, solvents, or ion exchange resins which may be linked to magnetic beads. A kit may also comprise a table of measured or calculated molecular masses and/or base compositions of bioagents using the primer pairs of the kit.

In some embodiments, the kit includes a computer program stored on a computer formatted medium (such as a compact disk or portable USB disk drive, for example) comprising instructions which direct a processor to analyze data obtained from the use of the primer pairs. The instructions of the software transform data related to amplification products into a molecular mass or base composition which is a useful concrete and tangible result used in identification and/or classification of bioagents. In some embodiments, the kits contain all of the reagents sufficient to carry out one or more of the methods described herein.

While the present compositions and methods have been described with specificity in accordance with certain of their embodiments, the following examples serve only as illustration and are not intended as limitation. It should be understood that these examples are for illustrative purposes only and are not to be construed as limiting in any manner.

EXAMPLES

Example 1 Configuration and Validation of Primers that Define Bioagent Identifying Amplicons for Fungi

A. General Process of Primer Configuration

For configuration of primers that define fungal identifying amplicons, a series of fungal genome segment sequences were obtained, aligned and scanned for regions where pairs of PCR primers would amplify products of about 45 to about 200 nucleotides in length and distinguish species and/or individual strains from each other by their molecular masses or base compositions. A typical process shown in FIG. 1 is employed for this type of analysis.

A database of expected base compositions for each primer region was generated using an in silico PCR search algorithm, such as ePCR. An existing RNA structure search algorithm (Macke et al., Nucl. Acids Res., 2001, 29, 4724-4735, which is incorporated herein by reference in its entirety) has been modified to include PCR parameters such as hybridization conditions, mismatches, and thermodynamic calculations (SantaLucia, Proc. Natl. Acad. Sci. U.S.A., 1998, 95, 1460-1465, which is incorporated herein by reference in its entirety). This also provides information on primer specificity of the selected primer pairs.

B. Design of Primers for Identification of Fungi

A series of primer pairs (Table 2) have been configured which target ribosomal RNA genes (23S, 25S and 18S) from fungi. The rRNA sequences, obtained from the European Ribosomal Database and public fungal genome sequencing projects, were aligned to each other and to the analogous Homo sapiens 5.8S rRNA sequence (GenBank accession #J01866). Primers were selected to specifically exclude amplification of human DNA. Primer pair numbers 3029 (SEQ ID NOs: 9:24), 3030 (SEQ ID NOs: 10:25), 3031 (SEQ ID NOs: 11:26), and 3032 (SEQ ID NOs: 12:27) in Table 2 were compared to the non-redundant GenBank nucleotide database in a theoretical, electronic PCR pre-screening process. Five mismatches were allowed to each primer and a thermodynamic model incorporating base mismatch parameters was utilized to give a global view of phylogenetic specificity and potential cross-reactivity for the four primer pairs. None of the four primer pairs produced a predicted PCR product from any member of the phylum Chordata, suggesting that human and other vertebrate DNAs are not likely to provide a viable template that would inhibit amplification of fungal DNA from clinical specimens.

As a non-limiting example, primer pair number 3030 hybridizes to 25S rRNA and produces amplification products of the following species of fungi: Candida albicans, Candida dubliniensis, Candida glabrata, Uncinocarpus reesii, Eremothecium gossypii, Saccharomyces cerevisiae, Aspergillus oryzae, Aspergillus fumigatus, Aspergillus terreus, Ajellomyces capsulatus, Neosartorya fischeri, Penicillium verruculosum, Chaetomium globosum, Gibberella moniliformis, Hypocrea jecorina, Verticillium dahliae, Magnaporthe grisea, Symbiotaphrina kochii, Phaeosphaeria nodorum, Lecophagus sp., Botryotinia fuckeliana, Arxula adeninivorans, Saccharomycopsis fibuligera, Schizosaccharomyces japonicus, Schizosaccharomyces pombe, Endogone pisiformis, Tricholoma matsutake, Pneumocystis carinii, Rhizomucor miehei, Mucor racemosus, Rhizopus stolonifer, Endogone lactiflua, Phycomyces blakesleeanus, Cokeromyces recurvatus, Mortierella verticillata, Cryptococcus neoformans, Basidiobolus ranarum, Umbelopsis ramanniana, Mortierella sp., Smittium culisetae, Furculomyces boomerangus, Piptocephalis corymbifera, Kuzuhaea moniliformis, Conidiobolus coronatus, Entomophthora muscae, Dimargaris bacillispora, Orphella haysii, Spiromyces aspiralis, Spiromyces minutus, Coemansia reversa, Rhopalomyces elegans, and Bdelloura candida.

Table 2 represents a collection of primers (sorted by primer pair number) configured to identify fungi using the methods described herein. Tp represents propynylated T and Cp represents propynylated C, wherein the propynyl substituent is located at the 5-position of the pyrimidine nucleobase. The primer pair number is an in-house database index number. The forward or reverse primer name shown in Table 2 indicates the gene region of the fungal genome to which the primer hybridizes relative to a reference sequence. The forward primer name 25SCANDIDA_X70659_996_1022_F indicates that the forward primer (_F) hybridizes to residues 996-1022 of a reference 25S rRNA Candida sequence (GenBank Accession Number X70659). GenBank Accession Number X53497 (Candida albicans 16S rRNA) is also used for primer configuration. It is notable that the gene nomenclature used is consistent with what is reported in GenBank for these accession numbers. However, as those of ordinary skill in the art know, nomenclature often differs. For instance, the 16S nomenclature used for X53497 is also referred to as 18S. So too is 25S of X70659 also referred to as 23S. These and other ribosomal genes and their nomenclature are known to those ordinarily skilled in the art.

TABLE 2 Primer Pairs for Identification of Fungi

Primer
Pair
Number  Forward Primer Name           Forward Sequence                 Forward SEQ ID NO:  Reverse Primer Name            Reverse Sequence               Reverse SEQ ID NO:
884     16S_X53497_1490_1516_F        TCGAGGTCTGGGTAATCTTGTGAAACT      1                   18S_X53497_1550_1574_R         TGCGAGGTATTCCTCGTTGAAGAGC      15
885     16S_X53497_1302_1325_F        TCGATAACGAACGAGACCTTAACC         2                   18S_X53497_1392_1417_R         TCCTGTTATTGCCTCAAACTTCCATC     16
886     16S_X53497_1298_1323_F        TGCTGCGATAACGAACGAGACCTTAA       3                   18S_X53497_1398_1423_R         TCACAGACCTGTTATTGCCTCAAACT     17
887     16S_X53497_236_262_F          TCCCGGGTGATTCATAATAACTTCTCG      4                   18S_X53497_328_350_R           TGCGACCATGGTAGGCCTCTATC        18
888     25S_X70659_2472_2496_F        TTGTAGAATAAGTGGGAGCTTCGGC        5                   25S_X70659_2600_2624_R         TTCCCCACCTGACAATGTCTTCAAC      19
889     25S_X70659_2472_2496P_F       TTGTAGAATAAGTGGGAGCTpTpCpGGC     5                   25S_X70659_2600_2624P_R        TTCCCCACCTGACAATGTCTpTpCpAAC   19
890     25S_X70659_966_1022_F         TCTCAGGATAGCAGAAACTCGTATCAG      6                   25S_X70659_1108_1132_R         TCGCCCACGTCCAATTAAGTAACAA      20
891     25S_X70659_698_723_F          TCCGTCTAACATCTATGCGAGTGTTT       7                   25S_X70659_807_834_R           TCAGCTATGCTCTTACTCAAATCCATCC   21
892     25S_X70659_134_159_F          TGTGAAGCGGCAAAAGCTCAAATTTG       8                   25S_X70659_247_269_R           TCACGGGATTCTCACCCTCTGTG        22
894     25S_X70659_134_159_F          TGTGAAGCGGCAAAAGCTCAAATTTG       8                   25S_X70659_235_258_R           TCACCCTCTATGACGCCCTATTCC       23
893     25S_X70659_134_159_F          TGTGAAGCGGCAAAAGCTCAAATTTG       8                   25S_X70659_247_269P_R          TCACGGGATTCTCACpCpCpTCTGTG     22
895     25S_X70659_134_159P_F         TGTGAAGCGGCAAAAGCpTpCpAAATpTpTG  8                   25S_X70659_235_258P_R          TCACpCpCpTCTATGACGCCCTATpTpCC  23
3029    25SCANDIDA_X70659_996_1022_F  TCTCAGGATAGCAGAAGCTCGTATCAG      9                   23SCANDIDA_X70659_1104_1129_R  TCCACGTTCAATTAAGCAACAAGGAC     24
3030    25SFUNG_X70659_134_158_F      TGTGAAGCGGCAAAAGCTCAAATTT        10                  23SFUNG_X70659_235_261_R       TTCTCACCCTCTGTGACGGCCTGTTCC    25
3031    25SFUNG_X70659_697_722_F      TGGAGTCTAACATCTATGCGAGTGTT       11                  23SFUNG_X70659_808_834_R       TCAGCTATGCTCTTACTCAAATCCATC    26
3032    25SFUNG_X70659_2472_2496_F    TTGTAGAATAGGTGGGAGCTTCGGC        12                  23SFUNG_X70659_2593_2615_R     TGACAATGTCTTCAACCCGGATC        27
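
The primer naming convention described above can be unpacked programmatically. The sketch below is a hypothetical helper, not part of any in-house database software. It assumes names follow the pattern gene label, GenBank accession, start residue, end residue, and direction, separated by underscores. It also treats a trailing "P" after the end coordinate as marking a propynylated primer, which is an inference from the Table 2 entries rather than a rule stated in the text.

```python
import re
from typing import NamedTuple


class PrimerName(NamedTuple):
    gene: str           # gene label, e.g. "25SCANDIDA"
    accession: str      # GenBank accession of the reference sequence
    start: int          # first residue of the hybridization site
    end: int            # last residue of the hybridization site
    propynylated: bool  # assumed meaning of the "P" suffix
    direction: str      # "F" (forward) or "R" (reverse)

    @property
    def length(self):
        """Number of reference residues spanned by the primer."""
        return self.end - self.start + 1


_NAME = re.compile(
    r"(?P<gene>[0-9A-Za-z]+)_(?P<acc>[A-Z]+\d+)_"
    r"(?P<start>\d+)_(?P<end>\d+)(?P<p>P?)_(?P<dir>[FR])"
)


def parse_primer_name(name):
    """Split a primer name such as 25SCANDIDA_X70659_996_1022_F."""
    match = _NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    return PrimerName(
        gene=match["gene"],
        accession=match["acc"],
        start=int(match["start"]),
        end=int(match["end"]),
        propynylated=match["p"] == "P",
        direction=match["dir"],
    )


if __name__ == "__main__":
    primer = parse_primer_name("25SCANDIDA_X70659_996_1022_F")
    print(primer, primer.length)
```

Comparing `length` with the number of bases in the listed sequence gives a quick consistency check on a table entry.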

Example 2 Sample Preparation and PCR

Samples were processed to obtain viral genomic material using a Qiagen QIAamp Virus BioRobot MDx Kit. Resulting genomic material was amplified using an Eppendorf thermal cycler and the amplicons were characterized on a Bruker Daltonics MicroTOF instrument. The resulting data were analyzed using GenX software (SAIC, San Diego, Calif. and Ibis, Carlsbad, Calif.).

All PCR reactions were assembled in 50 microliter reaction volumes in a 96-well microtiter plate format using a Packard MPII liquid handling robotic platform and M.J. Dyad thermocyclers (MJ Research, Waltham, Mass.). The PCR reaction mixture consisted of 4 units of Amplitaq Gold, 1× buffer II (Applied Biosystems, Foster City, Calif.), 1.5 mM MgCl2, 0.4 M betaine, 800 μM dNTP mixture and 250 nM of each primer. The following typical PCR conditions were used: 95 °C for 10 min followed by 8 cycles of 95 °C for 30 seconds, 48 °C for 30 seconds, and 72 °C for 30 seconds, with the 48 °C annealing temperature increasing 0.9 °C with each of the eight cycles. The PCR was then continued for 37 additional cycles of 95 °C for 15 seconds, 56 °C for 20 seconds, and 72 °C for 20 seconds.
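
The cycling conditions can be written out as an explicit step list, which makes the annealing ramp easy to inspect. The sketch below is illustrative only. It assumes that the first of the eight ramped cycles anneals at 48 °C and that each later cycle is 0.9 °C warmer; the text does not say whether the first cycle already includes an increment. The function name and the tuple layout are hypothetical, and ramp times between temperatures are ignored.

```python
def build_pcr_program():
    """Return the program as (step name, temperature in deg C, seconds)."""
    program = [("initial denaturation", 95.0, 600)]
    # Eight cycles with the annealing temperature rising 0.9 deg C per cycle
    # (assumption: the first cycle anneals at 48.0 deg C).
    for cycle in range(8):
        anneal = round(48.0 + 0.9 * cycle, 1)
        program += [("denature", 95.0, 30),
                    ("anneal", anneal, 30),
                    ("extend", 72.0, 30)]
    # Thirty-seven further cycles at a fixed 56 deg C annealing temperature.
    for _ in range(37):
        program += [("denature", 95.0, 15),
                    ("anneal", 56.0, 20),
                    ("extend", 72.0, 20)]
    return program


if __name__ == "__main__":
    steps = build_pcr_program()
    print([temp for name, temp, _ in steps if name == "anneal"][:8])
    print(sum(seconds for _, _, seconds in steps), "seconds of hold time")
```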

Example 3 Solution Capture Purification of PCR Products for Mass Spectrometry with Ion Exchange Resin-Magnetic Beads

For solution capture of nucleic acids with ion exchange resin linked to magnetic beads, 25 μl of a 2.5 mg/mL suspension of BioClone amine terminated superparamagnetic beads were added to 25 to 50 μl of a PCR (or RT-PCR) reaction containing approximately 10 pM of a typical PCR amplification product. The above suspension was mixed for approximately 5 minutes by vortexing or pipetting, after which the liquid was removed using a magnetic separator. The beads containing bound PCR amplification product were then washed three times with 50 mM ammonium bicarbonate/50% MeOH or 100 mM ammonium bicarbonate/50% MeOH, followed by three more washes with 50% MeOH. The bound PCR amplicon was eluted with a solution of 25 mM piperidine, 25 mM imidazole, 35% MeOH which included peptide calibration standards.

Example 4 Mass Spectrometry and Base Composition Analysis

The ESI-FTICR mass spectrometer is based on a Bruker Daltonics (Billerica, Mass.) Apex II 70e electrospray ionization Fourier transform ion cyclotron resonance mass spectrometer that employs an actively shielded 7 Tesla superconducting magnet. The active shielding constrains the majority of the fringing magnetic field from the superconducting magnet to a relatively small volume. Thus, components that might be adversely affected by stray magnetic fields, such as CRT monitors, robotic components, and other electronics, can operate in close proximity to the FTICR spectrometer. All aspects of pulse sequence control and data acquisition were performed on a 600 MHz Pentium II data station running Bruker's Xmass software under the Windows NT 4.0 operating system. Sample aliquots, typically 15 μl, were extracted directly from 96-well microtiter plates using a CTC HTS PAL autosampler (LEAP Technologies, Carrboro, N.C.) triggered by the FTICR data station. Samples were injected directly into a 10 μl sample loop integrated with a fluidics handling system that supplies the 100 μl/hr flow rate to the ESI source. Ions were formed via electrospray ionization in a modified Analytica (Branford, Conn.) source employing an off-axis, grounded electrospray probe positioned approximately 1.5 cm from the metalized terminus of a glass desolvation capillary. The atmospheric pressure end of the glass capillary was biased at 6000 V relative to the ESI needle during data acquisition. A counter-current flow of dry N2 was employed to assist in the desolvation process. Ions were accumulated in an external ion reservoir comprised of an rf-only hexapole, a skimmer cone, and an auxiliary gate electrode, prior to injection into the trapped ion cell where they were mass analyzed. Ionization duty cycles greater than 99% were achieved by simultaneously accumulating ions in the external ion reservoir during ion detection. Each detection event consisted of 1M data points digitized over 2.3 s. To improve the signal-to-noise ratio (S/N), 32 scans were co-added for a total data acquisition time of 74 s.

The ESI-TOF mass spectrometer is based on a Bruker Daltonics MicroTOF™ device (Bruker Daltonics, Billerica, Mass.). Ions from the ESI source undergo orthogonal ion extraction and are focused in a reflectron prior to detection. The TOF and FTICR are equipped with the same automated sample handling and fluidics described above. Ions are formed in the standard MicroTOF™ ESI source that is equipped with the same off-axis sprayer and glass capillary as the FTICR ESI source. Consequently, source conditions were the same as those described above. External ion accumulation was also employed to improve ionization duty cycle during data acquisition. Each detection event on the TOF was comprised of 75,000 data points digitized over 75 μs.

The sample delivery scheme allows sample aliquots to be rapidly injected into the electrospray source at high flow rate and subsequently be electrosprayed at a much lower flow rate for improved ESI sensitivity. Prior to injecting a sample, a bolus of buffer was injected at a high flow rate to rinse the transfer line and spray needle to avoid sample contamination/carryover. Following the rinse step, the autosampler injected the next sample and the flow rate was switched to low flow. Following a brief equilibration delay, data acquisition commenced. As spectra were co-added, the autosampler continued rinsing the syringe and picking up buffer to rinse the injector and sample transfer line. In general, two syringe rinses and one injector rinse were required to minimize sample carryover. During a routine screening protocol a new sample mixture was injected every 106 seconds. More recently a fast wash station for the syringe needle has been implemented which, when combined with shorter acquisition times, facilitates the acquisition of mass spectra at a rate of just under one spectrum/minute.
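
The timing figures quoted above combine into simple throughput estimates. The sketch below only restates that arithmetic: 32 co-added scans of 2.3 s each account for the roughly 74 s acquisition on the FTICR, and one injection every 106 seconds sets the time needed for a full 96-well plate. The function names are hypothetical, and a full plate run with no pauses between injections is an assumption.

```python
def acquisition_seconds(scans=32, seconds_per_scan=2.3):
    """Total acquisition time for co-added scans."""
    return scans * seconds_per_scan


def plate_run_seconds(wells=96, seconds_per_sample=106):
    """Time to inject every well of a plate at a fixed injection interval."""
    return wells * seconds_per_sample


if __name__ == "__main__":
    print(round(acquisition_seconds(), 1), "s per acquisition")
    print(round(plate_run_seconds() / 3600, 2), "h per 96-well plate")
```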

Raw mass spectra were post-calibrated with an internal mass standard and deconvoluted to monoisotopic molecular masses. Unambiguous base compositions were derived from the exact mass measurements of the complementary single-stranded oligonucleotides. Quantitative results are obtained by comparing the peak heights with an internal PCR calibration standard present in every PCR well at 500 molecules per well. Calibration methods are commonly owned and disclosed in International Application WO 2005/094421, which is incorporated herein by reference in its entirety.

Example 5 De Novo Determination of Base Composition of Amplification Products Using Molecular Mass Modified Deoxynucleotide Triphosphates

Because the molecular masses of the four natural nucleobases have a relatively narrow molecular mass range (A=313.058, G=329.052, C=289.046, T=304.046; see Table 3), a persistent source of ambiguity in assignment of base composition can occur as follows: two nucleic acid strands having different base composition may have a difference of about 1 Da when the base composition difference between the two strands is G⇄A (−15.994) combined with C⇄T (+15.000). For example, one 99-mer nucleic acid strand having a base composition of A27 G30 C21 T21 has a theoretical molecular mass of 30779.058 while another 99-mer nucleic acid strand having a base composition of A26 G31 C22 T20 has a theoretical molecular mass of 30780.052. A 1 Da difference in molecular mass may be within the experimental error of a molecular mass measurement and thus, the relatively narrow molecular mass range of the four natural nucleobases imposes an uncertainty factor. The 1 Da uncertainty factor is eliminated through amplification of a nucleic acid with one mass-tagged nucleobase and three natural nucleobases.

Addition of significant mass to one of the 4 nucleobases (dNTPs) in an amplification reaction, or in the primers themselves, will result in a mass difference significantly greater than 1 Da between amplification products that would otherwise be ambiguous because of the G⇄A combined with C⇄T event (Table 3). Thus, the same G⇄A (−15.994) event combined with a 5-Iodo-C⇄T (−110.900) event would result in a molecular mass difference of 126.894. If the molecular mass of the base composition A27 G30 5-Iodo-C21 T21 (33422.958) is compared with that of A26 G31 5-Iodo-C22 T20 (33549.852), the theoretical molecular mass difference is +126.894. The experimental error of a molecular mass measurement is not significant with regard to this molecular mass difference. Furthermore, the only base composition consistent with a measured molecular mass of the 99-mer nucleic acid is A27 G30 5-Iodo-C21 T21. In contrast, the analogous amplification without the mass tag has 18 possible base compositions.

TABLE 3 Molecular Masses of Natural Nucleobases and the Mass-Modified Nucleobase 5-Iodo-C and Molecular Mass Differences Resulting from Transitions

Nucleobase   Molecular Mass   Transition       Δ Molecular Mass
A            313.058          A-->T            −9.012
A            313.058          A-->C            −24.012
A            313.058          A-->5-Iodo-C     101.888
A            313.058          A-->G            15.994
T            304.046          T-->A            9.012
T            304.046          T-->C            −15.000
T            304.046          T-->5-Iodo-C     110.900
T            304.046          T-->G            25.006
C            289.046          C-->A            24.012
C            289.046          C-->T            15.000
C            289.046          C-->G            40.006
5-Iodo-C     414.946          5-Iodo-C-->A     −101.888
5-Iodo-C     414.946          5-Iodo-C-->T     −110.900
5-Iodo-C     414.946          5-Iodo-C-->G     −85.894
G            329.052          G-->A            −15.994
G            329.052          G-->T            −25.006
G            329.052          G-->C            −40.006
G            329.052          G-->5-Iodo-C     85.894
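By way of illustration only, the mass arithmetic of this example can be reproduced with a short script. The sketch below assumes that the theoretical molecular mass of a strand is simply the sum of the Table 3 masses weighted by the base composition, which reproduces the figures quoted above; the names `MASS` and `strand_mass` are hypothetical and are not part of the disclosed software.

```python
# Masses taken from Table 3; the dictionary and function names are illustrative only.
MASS = {"A": 313.058, "G": 329.052, "C": 289.046, "T": 304.046, "5-Iodo-C": 414.946}


def strand_mass(composition):
    """Sum the Table 3 masses for a base composition such as {"A": 27, "G": 30, ...}."""
    total = sum(MASS[base] * count for base, count in composition.items())
    return round(total, 3)


# The two 99-mers discussed above, first with natural C and then with 5-Iodo-C.
natural_1 = {"A": 27, "G": 30, "C": 21, "T": 21}
natural_2 = {"A": 26, "G": 31, "C": 22, "T": 20}
tagged_1 = {"A": 27, "G": 30, "5-Iodo-C": 21, "T": 21}
tagged_2 = {"A": 26, "G": 31, "5-Iodo-C": 22, "T": 20}

if __name__ == "__main__":
    # Natural bases: the two compositions differ by roughly 1 Da.
    print(strand_mass(natural_1), strand_mass(natural_2))
    # Mass-tagged C: the same pair of compositions is now separated by more than 100 Da.
    print(strand_mass(tagged_1), strand_mass(tagged_2))
```

The same function can be applied to any composition expressed in the bases of Table 3, so that the separation obtained with a given mass tag may be checked before an assay is configured.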

Mass spectra of bioagent-identifying amplicons were analyzed independently using a maximum-likelihood processor, such as is widely used in radar signal processing. This processor, referred to as GenX, first makes maximum likelihood estimates of the input to the mass spectrometer for each primer by running matched filters for each base composition aggregate on the input data. This includes the GenX response to a calibrant for each primer.

The algorithm emphasizes performance predictions culminating in probability-of-detection versus probability-of-false-alarm plots for conditions involving complex backgrounds of naturally occurring organisms and environmental contaminants. Matched filters consist of a priori expectations of signal values given the set of primers used for each of the bioagents. A genomic sequence database is used to define the mass base count matched filters. The database contains the sequences of known bacterial bioagents and includes threat organisms as well as benign background organisms. The latter is used to estimate and subtract the spectral signature produced by the background organisms. A maximum likelihood detection of known background organisms is implemented using matched filters and a running-sum estimate of the noise covariance. Background signal strengths are estimated and used along with the matched filters to form signatures which are then subtracted. The maximum likelihood process is applied to this “cleaned up” data in a similar manner employing matched filters for the organisms and a running-sum estimate of the noise-covariance for the cleaned up data.

The amplitudes of all base compositions of bioagent-identifying amplicons for each primer are calibrated and a final maximum likelihood amplitude estimate per organism is made based upon the multiple single primer estimates. Models of all system noise are factored into this two-stage maximum likelihood calculation. The processor reports the number of molecules of each base composition contained in the spectra. The quantity of amplification product corresponding to the appropriate primer set is reported as well as the quantities of primers remaining upon completion of the amplification reaction.
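The two-stage estimate described above may be pictured with a generic, textbook matched-filter calculation. The following sketch is not the GenX processor; it assumes independent Gaussian noise with a known per-channel variance (a diagonal noise covariance), templates for the background and the target that do not overlap, and hypothetical function names and toy spectra chosen purely for illustration.

```python
def ml_amplitude(signal, template, noise_var):
    """Maximum-likelihood amplitude of `template` in `signal`, assuming independent
    Gaussian noise with per-channel variance `noise_var` (an illustrative assumption)."""
    num = sum(s * x / v for s, x, v in zip(template, signal, noise_var))
    den = sum(s * s / v for s, v in zip(template, noise_var))
    return num / den


def subtract(signal, template, amplitude):
    """Remove an estimated signature (amplitude times template) from the spectrum."""
    return [x - amplitude * s for x, s in zip(signal, template)]


# Toy templates: expected peak shapes of a background organism and a target organism.
background = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
target = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 3.0, 1.0]
noise_var = [1.0] * 8

# Toy spectrum: three units of background plus two units of target, noise omitted.
spectrum = [3.0 * b + 2.0 * t for b, t in zip(background, target)]

# Stage 1: estimate the background strength and subtract its signature.
bg_amp = ml_amplitude(spectrum, background, noise_var)
cleaned = subtract(spectrum, background, bg_amp)

# Stage 2: apply the matched filter for the target to the cleaned-up data.
target_amp = ml_amplitude(cleaned, target, noise_var)

if __name__ == "__main__":
    print(bg_amp, target_amp)
```

With overlapping templates or a non-diagonal noise covariance, a joint estimate over all templates would be needed; the sequential form is used here only because it mirrors the estimate-then-subtract order of the description.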

Base count blurring can be carried out as follows. “Electronic PCR” can be conducted on nucleotide sequences of the desired bioagents to obtain the different expected base counts that could be obtained for each primer pair. See for example, Schuler, Genome Res. 7:541-50, 1997. In one illustrative embodiment, one or more spreadsheets, such as Microsoft Excel workbooks, contain a plurality of worksheets. First in this example, there is a worksheet with a name similar to the workbook name; this worksheet contains the raw electronic PCR data. Second, there is a worksheet named “filtered bioagents base count” that contains bioagent name and base count; there is a separate record for each strain after removing sequences that are not identified with a genus and species and removing all sequences for bioagents with fewer than 10 strains. Third, there is a worksheet, “Sheet1”, that contains the frequency of substitutions, insertions, or deletions for this primer pair. This data is generated by first creating a pivot table from the data in the “filtered bioagents base count” worksheet and then executing an Excel VBA macro. The macro creates a table of differences in base counts for bioagents of the same species, but different strains. One of ordinary skill in the art may understand additional pathways for obtaining similar table differences without undue experimentation.

Application of an exemplary script involves the user defining a threshold that specifies the fraction of the strains that are represented by the reference set of base counts for each bioagent. The reference set of base counts for each bioagent may contain as many different base counts as are needed to meet or exceed the threshold. The set of reference base counts is defined by taking the most abundant strain's base type composition and adding it to the reference set, and then adding the next most abundant strain's base type composition until the threshold is met or exceeded. The current set of data was obtained using a threshold of 55%, which was obtained empirically.
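The selection of the reference set just described can be sketched as a greedy accumulation. The code below is an illustrative reconstruction and not the exemplary script itself; the function name `reference_set`, the tuple representation of a base count as (A, G, C, T), and the toy strain list are assumptions.

```python
from collections import Counter


def reference_set(strain_base_counts, threshold=0.55):
    """Add base counts in order of abundance until the chosen set represents
    at least `threshold` of the strains (55% is the value stated in the text)."""
    tally = Counter(strain_base_counts)
    total = sum(tally.values())
    chosen, covered = [], 0
    for base_count, n_strains in tally.most_common():
        if covered / total >= threshold:
            break
        chosen.append(base_count)
        covered += n_strains
    return chosen


# Hypothetical bioagent with ten strains and three distinct base counts (A, G, C, T).
x = (30, 38, 24, 36)
y = (30, 38, 24, 35)
z = (31, 38, 23, 36)
strains = [x] * 5 + [y] * 3 + [z] * 2

if __name__ == "__main__":
    # x alone represents 50% of the strains, so y is added to pass the threshold.
    print(reference_set(strains))
```

Ties in abundance are resolved here by order of first appearance, which is a property of `Counter.most_common` and not a rule stated in the text.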

For each base count not included in the reference base count set for that bioagent, the script then proceeds to determine the manner in which the current base count differs from each of the base counts in the reference set. This difference may be represented as a combination of substitutions, Si=Xi, and insertions, Ii=Yi, or deletions, Di=Zi. If there is more than one reference base count, then the reported difference is chosen using rules that aim to minimize the number of changes and, in instances with the same number of changes, minimize the number of insertions or deletions. Therefore, the primary rule is to identify the difference with the minimum sum (Xi+Yi) or (Xi+Zi), e.g., one insertion rather than two substitutions. If there are two or more differences with the minimum sum, then the one that will be reported is the one that contains the most substitutions.
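The selection rule above can be expressed compactly. In the sketch that follows, a base count is assumed to be a tuple (A, G, C, T), and the difference from a reference is decomposed into the smallest number of changes: paired gains and losses are counted as substitutions, and any net change in length as insertions or deletions. The function names are hypothetical and the code is not the exemplary script.

```python
def difference(current, reference):
    """Return (substitutions, insertions, deletions) that turn `reference` into
    `current`, using the fewest changes. Base counts are (A, G, C, T) tuples."""
    gained = sum(max(c - r, 0) for c, r in zip(current, reference))
    lost = sum(max(r - c, 0) for c, r in zip(current, reference))
    substitutions = min(gained, lost)     # one base lost and another gained
    insertions = max(gained - lost, 0)    # net increase in length
    deletions = max(lost - gained, 0)     # net decrease in length
    return substitutions, insertions, deletions


def reported_difference(current, references):
    """Primary rule: minimum total number of changes. Tie-break: most substitutions."""
    def rank(diff):
        subs, ins, dels = diff
        return (subs + ins + dels, -subs)
    return min((difference(current, ref) for ref in references), key=rank)


# Hypothetical base counts used only to exercise the two rules.
current = (30, 38, 24, 36)
ref_one_insertion = (30, 38, 24, 35)
ref_one_substitution = (31, 38, 23, 36)
ref_two_substitutions = (32, 36, 24, 36)

if __name__ == "__main__":
    # One insertion is preferred over two substitutions.
    print(reported_difference(current, [ref_one_insertion, ref_two_substitutions]))
    # With equal totals, the difference with more substitutions is reported.
    print(reported_difference(current, [ref_one_insertion, ref_one_substitution]))
```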

Differences between a base count and a reference composition are categorized as one, two, or more substitutions, one, two, or more insertions, one, two, or more deletions, and combinations of substitutions and insertions or deletions. The different classes of nucleobase changes and their probabilities of occurrence have been delineated in U.S. Patent Application Publication No. 2004209260, which is incorporated herein by reference in its entirety.

Example 6 Codon Base Composition Analysis—Assay Development

The information obtained by the codon analysis method is base composition. While base composition is not as information-rich as sequence, it can have the same practical utility in many situations. The genetic code uses all 64 possible permutations of four different nucleotides in a sequence of three, where each amino acid can be assigned to as few as one and as many as six codons. Since base composition analysis can only identify unique combinations, without determining the order, one might think that it would not be useful in genetic analysis. However, many problems of genetic analysis start with information that constrains the problem. For example, if there is prior knowledge of the biological bounds of a particular genetic analysis, the base composition may provide all the necessary and useful information. If one starts with prior knowledge of the starting sequence, and is interested in identifying variants from it, the utility of base composition depends upon the codons used and the amino acids of interest.

Analysis of the genetic code reveals three situations, illustrated in Tables 4A-C. In Table 4A, where the leucine codon CTA is comprised of three different nucleotides, each of the nine possible single mutations is always identifiable using base composition alone, and results in either a “silent” mutation, where the amino acid is not changed, or an unambiguous change to another specific amino acid. In either case, the resulting encoded amino acid is known, which is equivalent to the information obtained from sequencing. In Table 4B, where two of the three nucleotides of the original codon are the same, there is a loss of information from a base composition measurement compared to sequencing. In this case, three of the nine possible single mutations produce unambiguous amino acid choices, while the other six fall into three pairs, each pair sharing one base composition and therefore producing two indistinguishable options. For example, if starting with the phenylalanine codon TTC, then either one of the two Ts could change to A, and base composition analysis could not distinguish a first position change from a second position change. A first position change of T to A would encode an isoleucine and a second position change of T to A would encode a tyrosine. However, no other options are possible and the value of the information would depend upon whether distinguishing an encoded isoleucine from a tyrosine was biologically important. In Table 4C, all three positions have the same nucleotide, and therefore the ambiguity in amino acid identity is increased to three possibilities. Out of 64 codon choices, 24 have three unique nucleotides (as in Table 4A), 36 have two of the same and one different nucleotide (as in Table 4B) and 4 have the same nucleotide in all three positions (as in Table 4C).

TABLE 4A Wild Type Codon with Three Unique Nucleobases

Description       Codon(s)   Codon Base Composition   Amino Acid Coded
WILD TYPE CODON   CTA        A1C1T1                   Leu
Single Mutation   ATA        A2T1                     Ile
Single Mutation   GTA        A1G1T1                   Val
Single Mutation   TTA        A1T2                     Leu
Single Mutation   CAA        A2C1                     Gln
Single Mutation   CGA        A1G1C1                   Arg
Single Mutation   CCA        A1C2                     Pro
Single Mutation   CTG        G1C1T1                   Leu
Single Mutation   CTC        C2T1                     Leu
Single Mutation   CTT        C1T2                     Leu

TABLE 4B Wild Type Codon with Two Unique Nucleobases

Description        Codon(s)   Codon Base Composition   Amino Acid Coded
WILD TYPE CODON    TTC        C1T2                     Phe
Single Mutations   ATC, TAC   A1C1T1                   Ile, Tyr
Single Mutations   GTC, TGC   G1C1T1                   Val, Cys
Single Mutations   CTC, TCC   C2T1                     Leu, Ser
Single Mutation    TTA        A1T2                     Leu
Single Mutation    TTG        G1T2                     Leu
Single Mutation    TTT        T3                       Phe

TABLE 4C Wild Type Codon Having Three of the Same Nucleobase

Description        Codon(s)        Codon Base Composition   Amino Acid Coded
WILD TYPE CODON    TTT             T3                       Phe
Single Mutations   ATT, TAT, TTA   A1T2                     Ile, Tyr, Leu
Single Mutations   GTT, TGT, TTG   G1T2                     Val, Cys, Leu
Single Mutations   CTT, TCT, TTC   C1T2                     Leu, Ser, Phe
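The groupings of Tables 4A-C can be regenerated for any wild type codon by enumerating its nine single mutations and grouping them by base composition. The sketch below assumes the standard genetic code and reports amino acids by one-letter code (the tables use three-letter codes); the function names are hypothetical. It also tallies the three codon classes across all 64 codons.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): AMINO[i] for i, c in enumerate(product(BASES, repeat=3))}


def composition(codon):
    """Base composition in the notation of Tables 4A-C, e.g. 'CTA' -> 'A1C1T1'."""
    return "".join(f"{b}{codon.count(b)}" for b in "AGCT" if b in codon)


def single_mutations(codon):
    """Group the nine single-base mutants of `codon` by base composition.
    Each entry is (mutant codon, one-letter amino acid)."""
    groups = defaultdict(list)
    for pos in range(3):
        for base in "AGCT":
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                groups[composition(mutant)].append((mutant, CODON_TABLE[mutant]))
    return dict(groups)


def codon_classes():
    """Count codons by number of distinct nucleotides (3, 2 or 1)."""
    return dict(Counter(len(set(codon)) for codon in CODON_TABLE))


if __name__ == "__main__":
    for wild_type in ("CTA", "TTC", "TTT"):
        print(wild_type, single_mutations(wild_type))
    print(codon_classes())
```

A composition group containing more than one mutant marks an ambiguity that base composition alone cannot resolve, which is the situation illustrated in Tables 4B and 4C.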

Example 7 Identification of Fungi

Primer pair numbers 3029, 3030, 3031 and 3032 were tested in actual PCR reactions using either 500 pg or 50 pg of purified fungal DNA from each of the nine fungal species shown in Tables 5A and 5B. PCR reactions were performed using purified fungal DNA in the presence of 1.6 µg of human DNA (from whole blood) per reaction (an excess ratio of human to fungal DNA of 3200:1 and 32000:1, respectively). Reactions were desalted and analyzed by electrospray mass spectrometry. Resulting masses were compared to a database of molecular masses and corresponding base compositions of fungal bioagent identifying amplicons populated by selection of rRNA nucleic acid segments of fungal species from GenBank. All species were differentiated from one another by their base compositions using any combination of two primer pairs from the group, or primer pair 3032 alone. For example, shown in FIG. 3 are overlaid mass spectra and base compositions of amplification products corresponding to nine fungal bioagent identifying amplicons (base compositions are shown for the sense strand of each amplification product). PCR reactions in the presence of 1.6 µg human DNA yielded results comparable to those with fungal DNA alone. Base composition determinations from reactions performed on 50 pg of target DNA were consistent with results obtained with 500 pg of target DNA, both in the absence and presence of human DNA. The ability to amplify and differentiate multiple species from different phyla with a small number of primer pairs in a standardized platform will provide high value in a clinical setting. A typical real-time assay requires a probe to be configured specifically for each target species (or isolate in some cases). For example, whereas nine specific probes may be required to differentiate the nine species indicated in Tables 5A and 5B, primer pair 3032 alone, or any combination of two of the other three primer pairs, was found to be sufficient to identify these nine species.

It is particularly important to note that it is not necessary that the nucleic acid sequence of the fungus be known in order for an amplification product to be produced and identified as a fungus with similarities to known fungi. For several entries in Tables 5A and 5B, which provide base compositions for nine species of fungi produced with primer pairs 3029, 3030, 3031 and 3032, neither the expected base composition nor primer target site was known directly from sequence data at the time the primers were configured. For example, sequence data was not available for Candida kefyr. Provided that enough is known about near neighbors to a target organism, primers that broadly cover members of a phylogenetically-related group of organisms can be configured to generate an amplification product that, once analyzed by mass spectrometry, carries much more information than just the presence or absence of a product. The combination of four base compositions of the amplification products corresponding to bioagent identifying amplicons of Candida kefyr obtained using primer pair numbers 3029, 3030, 3031 and 3032 is distinct from the analogous combinations of base compositions of all other species tested, even though the expected compositions from that species were not known beforehand. Thus, the full sequence of a pathogen does not need to be known in order to differentiate it from known organisms using this embodiment. Shown in FIG. 4 is a three dimensional binary base composition diagram indicating binary base compositions for the nine amplification products obtained with primer pair number 3030, which indicates separation of base compositions in three dimensional space.

TABLE 5A Base Compositions of Fungal Bioagent Identifying Amplicons of Nine Species of Fungi Amplified with Primer Pair Numbers 3029 and 3030

Fungus                     Primer Pair 3029   Primer Pair 3030
Aspergillus fumigatus      not determined     A29 G41 C31 T26
Malassezia pachydermatis   A40 G31 C21 T42    A30 G39 C28 T30
Cryptococcus albidus       A42 G30 C21 T41    A34 G36 C28 T31
Cryptococcus laurentii     Did not prime      A31 G40 C28 T28
Candida parapsilosis       A42 G30 C18 T44    A32 G36 C21 T39
Candida tropicalis         A40 G33 C20 T41    A32 G36 C21 T39
Candida kefyr              A41 G32 C20 T41    A32 G37 C24 T35
Candida glabrata           A41 G32 C20 T41    A32 G36 C24 T36
Candida albicans           A42 G30 C18 T44    A30 G38 C24 T36

TABLE 5B Base Compositions of Fungal Bioagent Identifying Amplicons of Nine Species of Fungi Amplified with Primer Pair Numbers 3031 and 3032

Fungus                     Primer Pair 3031   Primer Pair 3032
Aspergillus fumigatus      A32 G46 C34 T29    A31 G38 C35 T44
Malassezia pachydermatis   Did not prime      A30 G41 C35 T42
Cryptococcus albidus       A36 G41 C26 T33    A32 G38 C32 T46
Cryptococcus laurentii     A37 G41 C25 T33    A31 G39 C32 T46
Candida parapsilosis       A37 G40 C25 T38    A34 G39 C30 T42
Candida tropicalis         A34 G44 C25 T35    A35 G37 C29 T44
Candida kefyr              A36 G43 C25 T34    A38 G33 C30 T48
Candida glabrata           A34 G45 C28 T37    A36 G34 C30 T50
Candida albicans           A36 G44 C24 T34    A35 G37 C31 T41
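Whether a given set of primer pairs separates the nine species can be checked directly against the entries of Tables 5A and 5B. The sketch below transcribes those entries and tests whether the combined base composition signatures are all distinct. It assumes, for illustration only, that a "not determined" or "Did not prime" entry is itself treated as part of a signature; the names `TABLE_5` and `distinguishes` are hypothetical.

```python
from itertools import combinations

# Entries transcribed from Tables 5A and 5B, keyed by primer pair number.
TABLE_5 = {
    "Aspergillus fumigatus": {3029: "not determined", 3030: "A29 G41 C31 T26",
                              3031: "A32 G46 C34 T29", 3032: "A31 G38 C35 T44"},
    "Malassezia pachydermatis": {3029: "A40 G31 C21 T42", 3030: "A30 G39 C28 T30",
                                 3031: "Did not prime", 3032: "A30 G41 C35 T42"},
    "Cryptococcus albidus": {3029: "A42 G30 C21 T41", 3030: "A34 G36 C28 T31",
                             3031: "A36 G41 C26 T33", 3032: "A32 G38 C32 T46"},
    "Cryptococcus laurentii": {3029: "Did not prime", 3030: "A31 G40 C28 T28",
                               3031: "A37 G41 C25 T33", 3032: "A31 G39 C32 T46"},
    "Candida parapsilosis": {3029: "A42 G30 C18 T44", 3030: "A32 G36 C21 T39",
                             3031: "A37 G40 C25 T38", 3032: "A34 G39 C30 T42"},
    "Candida tropicalis": {3029: "A40 G33 C20 T41", 3030: "A32 G36 C21 T39",
                           3031: "A34 G44 C25 T35", 3032: "A35 G37 C29 T44"},
    "Candida kefyr": {3029: "A41 G32 C20 T41", 3030: "A32 G37 C24 T35",
                      3031: "A36 G43 C25 T34", 3032: "A38 G33 C30 T48"},
    "Candida glabrata": {3029: "A41 G32 C20 T41", 3030: "A32 G36 C24 T36",
                         3031: "A34 G45 C28 T37", 3032: "A36 G34 C30 T50"},
    "Candida albicans": {3029: "A42 G30 C18 T44", 3030: "A30 G38 C24 T36",
                         3031: "A36 G44 C24 T34", 3032: "A35 G37 C31 T41"},
}


def distinguishes(primer_pairs):
    """True if the combined entries for `primer_pairs` differ for every species."""
    signatures = [tuple(row[p] for p in primer_pairs) for row in TABLE_5.values()]
    return len(set(signatures)) == len(signatures)


if __name__ == "__main__":
    print(3032, distinguishes((3032,)))
    for pair in combinations((3029, 3030, 3031), 2):
        print(pair, distinguishes(pair))
```

For example, primer pair 3030 taken alone yields the same base composition for Candida parapsilosis and Candida tropicalis, whereas adding a second primer pair separates them.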

The present invention includes any combination of the various species and subgeneric groupings falling within the generic disclosure. This invention therefore includes the generic description of the invention with a proviso or negative limitation removing any subject matter from the genus, regardless of whether or not the excised material is specifically recited herein.

While, in accordance with the patent statutes, descriptions of the various embodiments and examples have been provided, the scope of the invention is not to be limited thereto or thereby. Modifications and alterations of the present invention will be apparent to those skilled in the art without departing from the scope and spirit of the present invention.

Therefore, it will be appreciated that the scope of this invention is to be defined by the appended claims, rather than by the specific examples which have been presented by way of example.

Each reference (including, but not limited to, journal articles, U.S. and non-U.S. patents, patent application publications, international patent application publications, gene bank accession numbers, internet web sites, and the like) cited in the present application is incorporated herein by reference in its entirety.

1. A method for identification of a fungus in a sample comprising: amplifying nucleic acid from said fungus using an isolated oligonucleotide primer pair wherein each of the forward member and reverse member of the primer pair is independently 13 to 35 consecutive nucleobases in length and configured to hybridize with at least 70% complementarity to a region of GenBank Accession Number X70659, said region being from nucleobase 134 to nucleobase 269, to obtain an amplification product that comprises a length from 45-200 consecutive nucleobases.
 2. The method of claim 1 further comprising the step of determining a molecular mass of said amplification product.
 3. The method of claim 2 further comprising the step of calculating a base composition from said molecular mass.
 4. The method of claim 2 further comprising the step of comparing said determined molecular mass with a database of molecular masses indexed to primer pairs and known fungi bioagents, wherein a match between said determined molecular mass and a molecular mass in said database indicates the presence of said fungus in said sample.
 5. The method of claim 2 further comprising the step of comparing said determined molecular mass with a database of molecular masses indexed to primer pairs and known fungi bioagents, wherein a match between said determined molecular mass and a molecular mass in said database identifies the species or sub-species of said fungus in said sample.
 6. The method of claim 5 wherein said fungus in said sample is identified as a species of fungus or a sub-species of fungus.
 7. The method of claim 3 further comprising the step of comparing said calculated base composition with a database of base compositions indexed to primer pairs and known fungi bioagents, wherein a match between said calculated base composition and a base composition in said database indicates the presence of said fungus in said sample.
 8. The method of claim 3 further comprising the step of comparing said calculated base composition with a database of base compositions indexed to primer pairs and known fungi bioagents, wherein a match between said calculated base composition and a base composition in said database identifies the species or sub-species of said fungus in said sample.
 9. The method of claim 8 wherein said fungus in said sample is identified as a species of fungus or a sub-species of fungus.
 10. The method of claim 1 wherein said forward primer member of said primer pair hybridizes with at least 70% complementarity to a region of GenBank Accession Number X70659, said region being from nucleobase 134 to nucleobase 159.
 11. The method of claim 1 wherein said forward primer member is SEQ ID NO: 10.
 12. The method of claim 1 wherein said reverse primer member of said primer pair hybridizes with at least 70% complementarity to a region of GenBank Accession Number X70659, said region being from nucleobase 235 to nucleobase 269.
 13. The method of claim 1 wherein said reverse primer member is SEQ ID NO: 25.
 14. The method of claim 1 wherein said forward member and said reverse member are each independently configured to hybridize with at least 80% complementarity to a region of GenBank Accession Number X70659.
 15. The method of claim 1 wherein said forward member and said reverse member are each independently configured to hybridize with at least 90% complementarity to a region of GenBank Accession Number X70659.
 16. The method of claim 1 wherein said forward member and said reverse member are each independently configured to hybridize with at least 95% complementarity to a region of GenBank Accession Number X70659.
 17. The method of claim 1 wherein said forward member and said reverse member are each independently configured to hybridize with 100% complementarity to a region of GenBank Accession Number X70659.
 18. The method of claim 1 wherein at least one of said forward member and said reverse member comprises at least one modified nucleobase.
 19. The method of claim 18 wherein at least one of said at least one modified nucleobase is a mass modified nucleobase.
 20. The method of claim 19 wherein said mass modified nucleobase is 5-Iodo-C.
 21-69. (canceled)